High-Speed Serial Interfaces and a Summary of GTXE_COMMON / GTXE_CHANNEL Problems

FPGA小师兄 | 2023-01-15 11:19:44

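When debugging the GTX interface of an FPGA, you will often run into errors that mention GTXE_COMMON / GTXE_CHANNEL. This article explains why the error occurs and how to fix it, covering two different scenarios.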

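Introduction to RocketIO

As information technology advances, data rates between boards and between chips keep rising. Physical constraints prevent parallel buses from carrying high-speed data over long distances, so serial transmission has become the mainstream approach for high-speed links. The core component that makes high-speed serial transmission possible is the SerDes (SERializer/DESerializer), whose main job is parallel-to-serial and serial-to-parallel conversion.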
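The conversion a SerDes performs can be illustrated in software. The following is only a minimal sketch: the function names `serialize` and `deserialize` are hypothetical, and sending the least significant bit first is an assumption made for the example, not a statement about how the RocketIO hardware orders bits.

```python
def serialize(words, width):
    """Parallel-to-serial: flatten fixed-width words into a list of bits.

    Assumption for this sketch: the least significant bit goes out first.
    """
    bits = []
    for word in words:
        for i in range(width):
            bits.append((word >> i) & 1)
    return bits


def deserialize(bits, width):
    """Serial-to-parallel: regroup a bit stream into fixed-width words.

    Trailing bits that do not fill a whole word are ignored.
    """
    words = []
    for start in range(0, len(bits) - width + 1, width):
        word = 0
        for i in range(width):
            word |= bits[start + i] << i
        words.append(word)
    return words


if __name__ == "__main__":
    stream = serialize([0xA5, 0x3C], width=8)
    print(stream)
    print([hex(w) for w in deserialize(stream, width=8)])
```

A real transceiver also needs clock recovery and word alignment to find the word boundaries in the stream; the sketch sidesteps that by assuming the stream starts on a boundary.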

Table 2.1: RocketIO types and their maximum line rates (Gbps)

MGT: 6.5

GTP: 6.6, 3.75, 3.2

GTX: 12.5, 6.6, 6.5

GTH: 16.3, 16, 13.1, 11

GTZ: 28.05

GTY: 32.75, 30.5

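Xilinx has integrated hard SerDes blocks into its FPGAs since the Virtex-2 Pro family and calls them RocketIO modules. Using RocketIO in an FPGA design enables high-speed chip-to-chip data transfer. RocketIO can also serve as the foundation for application-specific interfaces by adding a protocol on top of it (at the data link layer), for example SGMII (Serial Gigabit Media Independent Interface), 10G Ethernet, Aurora, and RapidIO. Table 2.1 lists the RocketIO versions integrated in Xilinx FPGAs and their maximum data rates; the same RocketIO type supports different maximum rates on different devices.

Figure 2: RocketIO GTX/GTH block diagram

Figure 2 shows the structure of a RocketIO GTX/GTH transceiver. It is split into a transmitter and a receiver, and each of those is further divided into a PCS (Physical Coding Sublayer) and a PMA (Physical Medium Attachment) layer. Note that these are the PCS and PMA of the GTX/GTH transceiver itself and should not be confused with the PCS and PMA of the 10G Ethernet interface covered in an earlier article ("FPGA Implementation of the 10G Ethernet Interface: Everything You Need Is Here"). Likewise, when the GTX/GTH carries Aurora, SGMII, or RapidIO, each protocol has its own PCS/PMA; they differ from one another, but at their core they all rely on the PCS and PMA of the RocketIO GTX/GTH.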

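1. Scenario 1: GTXE_COMMON used more than once in the same quad

As shown in the figure above, a 485T FPGA uses a 1G SGMII interface and a 10G interface at the same time. A project containing only the 1G SGMII interface, or only the 10G interface, works correctly. Once the two projects are merged so that both interfaces are supported together, implementation fails with the following error: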

[Place 30-140]

Unroutable Placement! A GTXE_COMMON / GTXE_CHANNEL clock component pair is not placed in a routable site pair. The GTXE_COMMON component can use the dedicated path between the GTXE_COMMON and the GTXE_CHANNEL if both are placed in the same clock region. If this sub optimal condition is acceptable for this design, you may use the CLOCK_DEDICATED_ROUTE constraint in the .xdc file to demote this message to a WARNING. However, the use of this override is highly discouraged. These examples can be used directly in the .xdc file to override this clock rule.

< set_property CLOCK_DEDICATED_ROUTE FALSE [get_nets core_wrapper_i/inst/core_gt_common_i/gt0_qplloutclk_out] >

core_wrapper_i/inst/core_gt_common_i/gtxe2_common_i (GTXE2_COMMON.QPLLOUTCLK) is provisionally placed by clockplacer on GTXE2_COMMON_X1Y1

core_wrapper_i/inst/pcs_pma_block_i/transceiver_inst/gtwizard_inst/inst/gtwizard_i/gt0_GTWIZARD_i/gtxe2_i (GTXE2_CHANNEL.QPLLCLK) is locked to GTXE2_CHANNEL_X1Y1

The above error could possibly be related to other connected instances. Following is a list of all the related clock rules and their respective instances.

Clock Rule: rule_bufds_bufg

Status: PASS

Rule Description: A BUFDS driving a BUFG must be placed on the same half side (top/bottom) of the device

core_wrapper_i/inst/core_clocking_i/ibufds_gtrefclk (IBUFDS_GTE2.O) is locked to IBUFDS_GTE2_X1Y0

core_wrapper_i/inst/core_clocking_i/bufg_gtrefclk (BUFG.I) is provisionally placed by clockplacer on BUFGCTRL_X0Y2

Clock Rule: rule_bufds_gtxchannel_intelligent_pin

Status: PASS

Rule Description: A BUFDS driving a GTXChannel must both be placed in the same or adjacent clock region (top/bottom)

core_wrapper_i/inst/core_clocking_i/ibufds_gtrefclk (IBUFDS_GTE2.O) is locked to IBUFDS_GTE2_X1Y0

core_wrapper_i/inst/pcs_pma_block_i/transceiver_inst/gtwizard_inst/inst/gtwizard_i/gt0_GTWIZARD_i/gtxe2_i (GTXE2_CHANNEL.GTREFCLK0) is locked to GTXE2_CHANNEL_X1Y1

Clock Rule: rule_bufds_gtxcommon_intelligent_pin

Status: PASS

Rule Description: A BUFDS driving a GTXCommon must both be placed in the same or adjacent clock region(top/bottom)

core_wrapper_i/inst/core_clocking_i/ibufds_gtrefclk (IBUFDS_GTE2.O) is locked to IBUFDS_GTE2_X1Y0

core_wrapper_i/inst/core_gt_common_i/gtxe2_common_i (GTXE2_COMMON.GTREFCLK0) is provisionally placed by clockplacer on GTXE2_COMMON_X1Y1

Clock Rule: rule_gt_bufg

Status: PASS

Rule Description: A GT driving a BUFG must be placed on the same half side (top/bottom) of the device

core_wrapper_i/inst/pcs_pma_block_i/transceiver_inst/gtwizard_inst/inst/gtwizard_i/gt0_GTWIZARD_i/gtxe2_i (GTXE2_CHANNEL.RXOUTCLK) is locked to GTXE2_CHANNEL_X1Y1

and core_wrapper_i/inst/core_clocking_i/rxrecclkbufg (BUFG.I) is provisionally placed by clockplacer on BUFGCTRL_X0Y3

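The root cause turned out to be that both the 1G SGMII core and the 10G Ethernet core generated a gtcommon when they were instantiated. Removing one of the two made the error go away.

A quad can contain only one gtcommon.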
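The one-gtcommon-per-quad rule can be checked against a list of intended placements. The sketch below is illustrative only. It assumes the usual 7-series site naming, where GTXE2_COMMON_XmYn serves GTXE2_CHANNEL_XmY(4n) through XmY(4n+3); the function names and the cell names in the usage example are hypothetical.

```python
import re
from collections import defaultdict

SITE_RE = re.compile(r"^(GTXE2_COMMON|GTXE2_CHANNEL)_X(\d+)Y(\d+)$")


def quad_of(site):
    """Return (column, quad index) for a GTXE2 site name."""
    m = SITE_RE.match(site)
    if m is None:
        raise ValueError(f"not a GTXE2 site: {site}")
    kind, x, y = m.group(1), int(m.group(2)), int(m.group(3))
    # Assumption: four channels per quad, so a channel's Y index // 4 is the
    # quad index, while a COMMON site's Y index already is the quad index.
    return (x, y // 4 if kind == "GTXE2_CHANNEL" else y)


def duplicate_commons(common_cells):
    """Group GTXE2_COMMON cells by target quad; keep quads with more than one."""
    by_quad = defaultdict(list)
    for cell, site in common_cells.items():
        by_quad[quad_of(site)].append(cell)
    return {q: sorted(cells) for q, cells in by_quad.items() if len(cells) > 1}


if __name__ == "__main__":
    # Hypothetical cell names: both cores want a COMMON in the same quad.
    wanted = {
        "sgmii_core/gt_common": "GTXE2_COMMON_X1Y0",
        "ten_gig_core/gt_common": "GTXE2_COMMON_X1Y0",
    }
    print(duplicate_commons(wanted))
    # Sites taken from the error log above: they fall into different quads.
    print(quad_of("GTXE2_COMMON_X1Y1"), quad_of("GTXE2_CHANNEL_X1Y1"))
```

Under that naming assumption, the log is consistent with the diagnosis: the channel locked to GTXE2_CHANNEL_X1Y1 belongs to quad 0, while the GTXE2_COMMON was provisionally placed in quad 1, so no dedicated QPLL clock route exists between them.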

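2. Scenario 2: Routing problem when Aurora master and slave cores drive GTXs across banks

This Aurora chip-to-chip link involves two FPGAs. To meet the throughput requirement, Aurora runs as a master core plus a slave core, each with two bonded lanes, so four GTXs carry the data between the chips (each GTX runs at 4 Gbps and each Aurora core uses two GTXs, so in this example one Aurora core supports up to 8 Gbps).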
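The rate arithmetic above is simple enough to write down. This is a small sketch with a hypothetical helper name; it computes the raw line rate only and assumes nothing about line-encoding overhead, which reduces the usable payload rate.

```python
def aggregate_rate_gbps(lane_rate_gbps, lanes_per_core, cores=1):
    """Raw aggregate line rate of bonded lanes, before any encoding overhead."""
    return lane_rate_gbps * lanes_per_core * cores


if __name__ == "__main__":
    # One Aurora core: 2 lanes at 4 Gbps each.
    print(aggregate_rate_gbps(4, 2))           # 8
    # Master plus slave core: 4 lanes in total.
    print(aggregate_rate_gbps(4, 2, cores=2))  # 16
```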

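In the project, the master and slave cores are instantiated as shown in the screenshots below (top: master core, bottom: slave core):

Master core

Slave core

The two screenshots show that the main difference between the master and slave cores is where the clocking and reset logic lives: for the master core it is included inside the instantiated IP core, whereas for the slave core it is in the generated example design.

Once the design around the Aurora cores was complete, the GTX pins had to be assigned. The GTX pins of FPGA1 and FPGA2 are shown below (top: FPGA1, bottom: FPGA2):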

FPGA1

FPGA2

The original plan was to use four GTXs in a single BANK. During pin assignment it turned out that this works for FPGA1, but on FPGA2 only three GTXs of BANK 115 have their pins wired, and the clock pins of BANK 116 are not connected to an oscillator. Because a differential reference clock can drive an adjacent BANK, BANK 115 and BANK 116 were used together, with two GTXs in each BANK.

Implementation of the project then failed with the following error:

[Place 30-140] Unroutable Placement! A GTXE_COMMON / GTXE_CHANNEL clock component pair is not placed in a routable site pair. The GTXE_COMMON component can use the dedicated path between the GTXE_COMMON and the GTXE_CHANNEL if both are placed in the same clock region. If this sub optimal condition is acceptable for this design, you may use the CLOCK_DEDICATED_ROUTE constraint in the .xdc file to demote this message to a WARNING. However, the use of this override is highly discouraged. These examples can be used directly in the .xdc file to override this clock rule.

< set_property CLOCK_DEDICATED_ROUTE FALSE [get_nets core_wrapper_i/inst/core_gt_common_i/gt0_qplloutclk_out] >

core_wrapper_i/inst/core_gt_common_i/gtxe2_common_i (GTXE2_COMMON.QPLLOUTCLK) is provisionally placed by clockplacer on GTXE2_COMMON_X1Y1

core_wrapper_i/inst/pcs_pma_block_i/transceiver_inst/gtwizard_inst/inst/gtwizard_i/gt0_GTWIZARD_i/gtxe2_i (GTXE2_CHANNEL.QPLLCLK) is locked to GTXE2_CHANNEL_X1Y1

The above error could possibly be related to other connected instances. Following is a list of all the related clock rules and their respective instances.

Clock Rule: rule_bufds_bufg

Status: PASS

Rule Description: A BUFDS driving a BUFG must be placed on the same half side (top/bottom) of the device

core_wrapper_i/inst/core_clocking_i/ibufds_gtrefclk (IBUFDS_GTE2.O) is locked to IBUFDS_GTE2_X1Y0

core_wrapper_i/inst/core_clocking_i/bufg_gtrefclk (BUFG.I) is provisionally placed by clockplacer on BUFGCTRL_X0Y2

Clock Rule: rule_bufds_gtxchannel_intelligent_pin

Status: PASS

Rule Description: A BUFDS driving a GTXChannel must both be placed in the same or adjacent clock region (top/bottom)

core_wrapper_i/inst/core_clocking_i/ibufds_gtrefclk (IBUFDS_GTE2.O) is locked to IBUFDS_GTE2_X1Y0

core_wrapper_i/inst/pcs_pma_block_i/transceiver_inst/gtwizard_inst/inst/gtwizard_i/gt0_GTWIZARD_i/gtxe2_i (GTXE2_CHANNEL.GTREFCLK0) is locked to GTXE2_CHANNEL_X1Y1

Clock Rule: rule_bufds_gtxcommon_intelligent_pin

Status: PASS

Rule Description: A BUFDS drivin

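The error indicates a GTX placement problem and points to the GTX COMMON and GTX CHANNEL. The Aurora user guide contains the following:

The figure above shows that the external differential clock first passes through a GTX_COMMON block, which generates QPLLOUTCLK, and that this clock is then connected to the GTX_CHANNEL. This is how the differential clock drives a GTX.

When one differential clock drives multiple GTXs, the situation looks like this:

The figure shows that one differential clock can drive up to 12 GTXs. The GTX_COMMON block needs attention, however: one GTX_COMMON can drive at most the 4 GTX_CHANNELs in its own QUAD. To drive more than four GTXs, or GTXs in another QUAD, a new GTX_COMMON must be instantiated so that the GTXs in the other QUAD work correctly.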
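These two limits can be sketched as a small planning helper. It is a hedged illustration, not a replacement for the placer: it assumes the lanes are clocked from the QPLL as in this article, that channel sites follow the 7-series naming in which GTXE2_CHANNEL_XmYk lies in quad k // 4, and that a reference clock reaches its own quad plus the quad directly above and below. The function names and the site list are hypothetical.

```python
import re

CHANNEL_RE = re.compile(r"^GTXE2_CHANNEL_X(\d+)Y(\d+)$")


def commons_needed(channel_sites):
    """Return the GTXE2_COMMON sites required by a set of QPLL-clocked channels."""
    quads = set()
    for site in channel_sites:
        m = CHANNEL_RE.match(site)
        if m is None:
            raise ValueError(f"not a GTXE2_CHANNEL site: {site}")
        x, y = int(m.group(1)), int(m.group(2))
        quads.add((x, y // 4))  # assumption: four channels per quad
    return [f"GTXE2_COMMON_X{x}Y{q}" for x, q in sorted(quads)]


def refclk_reaches(refclk_quad, channel_quad):
    """Assumption: a reference clock serves its own quad and both neighbours."""
    return abs(refclk_quad - channel_quad) <= 1


if __name__ == "__main__":
    # Hypothetical placement: two lanes in each of two adjacent quads.
    lanes = [
        "GTXE2_CHANNEL_X0Y2", "GTXE2_CHANNEL_X0Y3",
        "GTXE2_CHANNEL_X0Y4", "GTXE2_CHANNEL_X0Y5",
    ]
    print(commons_needed(lanes))
    print(refclk_reaches(0, 1), refclk_reaches(0, 2))
```

Three reachable quads of four channels each give the 12 GTXs mentioned above, while the number of GTX_COMMON blocks grows with the number of quads actually occupied.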

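Checking the project at this point revealed the problem. The master Aurora IP core contains a GTX_COMMON block, but the slave core's GTX_COMMON block, which sits outside the IP core, had been commented out. In the earlier design all four GTXs of the two IP cores were in the same BANK, so the GTX_COMMON block had been removed from the slave core's top level, and the signals it would otherwise produce (gt_qpllclk_quad1_in_i and gt_qpllrefclk_quad1_in_i) were generated by the master core and wired externally to the slave core.

The fix was to add the slave core's GTX_COMMON block back into the slave core's top level and to drive gt_qpllclk_quad1_in_i and gt_qpllrefclk_quad1_in_i from that GTX_COMMON block rather than from the external connection. This resolved the problem.

End of article.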
