The Xilinx tools generate many pretty text files. Among the most useful are the fitter report (which shows you the utilization) and the timing report.
Here's what the fitter says, with a few extra features to support reading/writing a 4-bit LED register and driving the LEDs on the board. The design is optimized for speed, which means the fitter has flattened out some logic to get faster pin-to-pin delays at the expense of space. It's not clear that this buys anything, since it can't flatten everything and the design is only as fast as its slowest signals. At one point I had a 17-bit performance counter (accessible via the memory window) and it still fit (after the exhaustive fitting option backed down on the pterm limit a few times).
cpldfit:  version G.31a                             Xilinx Inc.
                                  Fitter Report
Design Name: pci                                 Date:  9-22-2004,  1:05AM
Device Used: XC95108-7-PC84
Fitting Status: Successful

****************************  Resource Summary  ****************************

Macrocells     Product Terms    Registers      Pins           Function Block
Used           Used             Used           Used           Inputs Used
72 /108 ( 67%) 372 /540 ( 69%)  53 /108 ( 49%) 50 /69 ( 72%)  193/216 ( 89%)

PIN RESOURCES:

Signal Type    Required     Mapped  |  Pin Type            Used   Remaining
------------------------------------|---------------------------------------
Input         :    8           8    |  I/O              :    44       19
Output        :   11          11    |  GCK/IO           :     3        0
Bidirectional :   30          30    |  GTS/IO           :     2        0
GCK           :    1           1    |  GSR/IO           :     1        0
GTS           :    0           0    |
GSR           :    0           0    |
                 ----        ----
        Total     50          50
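The percentages in the resource summary are just used/total, so it is easy to recompute them and see how much headroom is left when comparing design variations. The following is a minimal sketch in Python, not part of the Xilinx tools; the `resources` table is typed in by hand from the report above, and the function names are made up for illustration.

```python
# Used/total pairs copied by hand from the XC95108 resource summary above.
resources = {
    "macrocells": (72, 108),
    "product terms": (372, 540),
    "registers": (53, 108),
    "pins": (50, 69),
    "function block inputs": (193, 216),
}


def utilization(used, total):
    """Return utilization as a whole-number percentage, rounded to nearest."""
    return round(100 * used / total)


def headroom(table):
    """Map each resource name to (percent used, units still free)."""
    return {name: (utilization(u, t), t - u) for name, (u, t) in table.items()}


for name, (pct, free) in headroom(resources).items():
    print(f"{name:22s} {pct:3d}% used, {free:3d} free")
```

Function block inputs are the tightest resource in this fit, which is worth keeping in mind before adding features.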
I decided to leave out the XC95108 timing report since it's a pack of lies anyway. I told WebPACK that I had a -7 speed grade part, which produces numbers that don't look so scary, even though in that best case they violate the PCI spec by a fair margin. Luckily there is a lot of slack in the PCI spec if you put your device very close to the master and the master isn't a stickler for the setup times defined by the spec. Also, my -20 speed grade part was probably binned down for marketing reasons.
Here's what it says for an XC95144XL-7, which is what I'd use if I were making another board. It actually squeaks the clk-to-pad delay in at under 13ns (that's the 30ns clock period minus 10ns for propagation delay and 7ns for setup time, according to the PCI spec).
Performance Summary Report
--------------------------

Design:     pci
Device:     XC95144XL-7-TQ100
Speed File: Version 3.0
Program:    Timing Report Generator:  version G.31a
Date:       Thu Sep 23 00:20:41 2004

Timing Constraint Summary:

TS_clk=PERIOD:clk:30.000nS:HIGH:50.000000%     Met
TS_P2P=FROM:PADS:TO:PADS:23.000nS              Met
AUTO_TS_2001=OFFSET OUT:23.000nS:AFTER:clk     Met

Performance Summary:

Clock net 'clk' path delays:

Clock Pad to Output Pad (tCO)    : 12.7ns (2 macrocell levels)
  Clock Pad 'clk' to Output Pad 'ad<29>' (GCK)

Clock to Setup (tCYC)            : 15.8ns (2 macrocell levels)
  Clock to Q, net 'baseaddr<0>.Q' to TFF Setup(D) at 'state<0>.D' (GCK)
  Target FF drives output net 'state<0>'

Setup to Clock at the Pad (tSU)  : 12.6ns (1 macrocell levels)
  Data signal 'cbe<3>' to TFF D input Pin at 'state<0>.D'
  Clock pad 'clk' (GCK)

Minimum Clock Period: 15.8ns
Maximum Internal Clock Speed: 63.2Mhz
  (Limited by Cycle Time)
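The 13ns figure is simple budget arithmetic, and it can be checked against the report with a few lines of Python. This is only a sketch under the simplified budget used in the text (period minus propagation delay minus setup time); it ignores clock skew and every other term in the PCI spec, and the function names are invented for illustration.

```python
# Timing figures in nanoseconds. The PCI numbers are the ones quoted in the
# text; skew and other spec terms are deliberately ignored (an assumption).
CLOCK_PERIOD_NS = 30.0  # matches the TS_clk constraint in the report
T_PROP_NS = 10.0        # propagation delay allowance
T_SETUP_NS = 7.0        # setup time at the receiving device


def clock_to_out_budget(period=CLOCK_PERIOD_NS, prop=T_PROP_NS, setup=T_SETUP_NS):
    """Time left for clock-to-pad delay once propagation and setup are paid."""
    return period - prop - setup


def slack(measured_ns, budget_ns):
    """Positive slack means the measured delay fits inside the budget."""
    return budget_ns - measured_ns


def max_clock_mhz(min_period_ns):
    """Convert a minimum clock period in ns to a frequency in MHz."""
    return 1000.0 / min_period_ns


budget = clock_to_out_budget()
print(f"clock-to-out budget: {budget:.1f} ns")
print(f"tCO slack:           {slack(12.7, budget):.1f} ns")   # tCO from the report
print(f"max internal clock:  {max_clock_mhz(15.8):.1f} MHz")  # from tCYC
```

With the report's 12.7ns tCO the slack comes out slightly positive, which is the "squeaks in under 13ns" result, and the frequency derived from the 15.8ns cycle time should land near the figure the report prints.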
For best results, let the fitter run without pin constraints on a few design variations to see how it likes to divide up the work. Then mimic that to the extent you can when laying out the board and assigning pins manually. I gave top priority to easy board routing (except for GSR and GCK, obviously) and I think that put unnecessary strain on the fitter to keep the outputs where I constrained them. My usage isn't well balanced.
Future Board Ideas
- Use an XC95144XL-7 in TQFP 100. It will require a 3.3V regulator but is tolerant of 5V I/O, making it possible to key the card for universal operation. That part also stands a chance of meeting the PCI timing requirements without fudging, and has enough extra space to include more features or a more complex backside interface. It's much cheaper new than an XC95108, which is becoming obsolete, but the TQFP 100 package is harder to solder.
- Leave the other GCK pins out when routing the PCI card edge so that they can be used as part of another interface (like SPI) to another chip, like a Microchip PIC.
- Include some secondary chip like a PIC that can do real processing or other IO and then interface to the PCI bus with the CPLD.
- Put some kind of secondary clock on the board on one of the global clock nets. A divide-by-33-million counter to get a one-second timer takes a lot of chip space. It's much easier to have a little 555 timer or watch crystal to do slow processing.
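To put a number on the last idea, here is a rough Python sketch of how many counter bits a one-second divider needs. It assumes a plain binary counter with one flip-flop (so roughly one macrocell) per bit, and it assumes the watch crystal is the common 32.768kHz kind; neither detail comes from the fitter, and `counter_bits` is a made-up helper.

```python
def counter_bits(divisor):
    """Flip-flops needed for a binary counter that counts 0 .. divisor-1."""
    return max(1, (divisor - 1).bit_length())


MACROCELLS = 108  # XC95108 total, from the fitter report

for label, divisor in [
    ("33 MHz PCI clock", 33_000_000),
    ("32.768 kHz watch crystal (assumed)", 32_768),
]:
    bits = counter_bits(divisor)
    share = 100 * bits / MACROCELLS
    print(f"{label}: {bits} bits, about {share:.0f}% of the macrocells")
```

Dividing the PCI clock down to one second takes 25 bits under these assumptions, a large bite out of a 108-macrocell part that is already about two-thirds full, while the slow crystal needs only 15.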