TE0820 as PS-PCIe Endpoint

Started by pema, June 13, 2023, 09:12:42 AM

pema

Hi John,
I have been trying for some time now to use the TE0820 as a PS-PCIe endpoint.
Here are my steps to configure it.

- Vivado (see the pictures as reference)
    - Enable PCIe as endpoint, Link speed 5GT/s
    - PCIe reset MIO       
    - PCIe clock input 100MHz clock source 3   
    - generate .xsa (hardware description file)     
- ClockBuilderPro
    - Generate .h for the SI5338 with CLK1 @ 100MHz
    - the reference project provided under the TE0820_test was used
- Vitis
    - Import the reference project provided under the TE0820_testboard
    - Change the fsbl to configure the clock generator SI5338
        - under te_Si5338-Registers.h paste the array values generated before under ClockBuilderPro
    - Compile and save the fsbl.elf for petalinux-package
- Petalinux
    - create a new project
    - config the project with the .xsa
    - Enable PCI endpoint device drivers
        - petalinux-config -c kernel -->  Device Drivers--> PCI support
    - Compile
    - pack together with pmu and fsbl generated under Vitis
    - send it to target

Once the system boots there are no references to pcie in dmesg or under /dev, and lspci of course outputs nothing.

Is any other modification required, for example in the FSBL or in the kernel config?

Best,
Pema
PS:
Would it be possible to provide the  [TE0820 TD TEF1002](https://wiki.trenz-electronic.de/display/PD/TE0820+TD+TEF1002) BSP compatible with Vivado 2022.0?

M Kirberg

Hi,

everything you did seems correct, nothing to add here.

Ideas:

You could post the relevant parts of your resulting devicetree, to see if everything there turned out as expected.
Double check if corresponding driver is included in your kernel (modules.builtin)

About an updated Reference Design, it is planned, but work has not started yet.

Best regards

pema

Hi M Kirberg,
thanks for your help. The device tree under /components/plnx_workspace/device-tree/device-tree/pcw.dtsi indeed doesn't show pcie.
Any ideas what might be going wrong?


pcw.dtsi

/*
* CAUTION: This file is automatically generated by Xilinx.
* Version: XSCT
* Today is: Wed Jun 14 15:03:32 2023
*/


&gic {
num_cpus = <2>;
num_interrupts = <96>;
};
&lpd_dma_chan1 {
status = "okay";
};
&lpd_dma_chan2 {
status = "okay";
};
&lpd_dma_chan3 {
status = "okay";
};
&lpd_dma_chan4 {
status = "okay";
};
&lpd_dma_chan5 {
status = "okay";
};
&lpd_dma_chan6 {
status = "okay";
};
&lpd_dma_chan7 {
status = "okay";
};
&lpd_dma_chan8 {
status = "okay";
};
&xilinx_ams {
status = "okay";
};
&can1 {
status = "okay";
};
&cci {
status = "okay";
};
&gem3 {
phy-mode = "rgmii-id";
status = "okay";
xlnx,ptp-enet-clock = <0x0>;
};
&fpd_dma_chan1 {
status = "okay";
};
&fpd_dma_chan2 {
status = "okay";
};
&fpd_dma_chan3 {
status = "okay";
};
&fpd_dma_chan4 {
status = "okay";
};
&fpd_dma_chan5 {
status = "okay";
};
&fpd_dma_chan6 {
status = "okay";
};
&fpd_dma_chan7 {
status = "okay";
};
&fpd_dma_chan8 {
status = "okay";
};
&gpio {
emio-gpio-width = <32>;
gpio-mask-high = <0x0>;
gpio-mask-low = <0x5600>;
status = "okay";
};
&gpu {
status = "okay";
xlnx,tz-nonsecure = <0x1>;
};
&i2c0 {
clock-frequency = <400000>;
status = "okay";
};
&i2c1 {
clock-frequency = <400000>;
status = "okay";
};
&qspi {
is-dual = <1>;
num-cs = <1>;
spi-rx-bus-width = <4>;
spi-tx-bus-width = <4>;
status = "okay";
};
&rtc {
status = "okay";
};
&sdhci1 {
clock-frequency = <187481262>;
status = "okay";
xlnx,mio-bank = <0x1>;
};
&psgtr {
status = "okay";
};
&ttc0 {
status = "okay";
};
&ttc1 {
status = "okay";
};
&ttc2 {
status = "okay";
};
&ttc3 {
status = "okay";
};
&uart0 {
cts-override ;
device_type = "serial";
port-number = <0>;
status = "okay";
u-boot,dm-pre-reloc ;
};
&uart1 {
cts-override ;
device_type = "serial";
port-number = <1>;
status = "okay";
u-boot,dm-pre-reloc ;
};
&lpd_watchdog {
status = "okay";
};
&watchdog0 {
status = "okay";
};
&pss_ref_clk {
clock-frequency = <33330000>;
};
&video_clk {
clock-frequency = <33333000>;
};
&ams_ps {
status = "okay";
};
&ams_pl {
status = "okay";
};



zynqmp.dtsi

pcie: pcie@fd0e0000 {
compatible = "xlnx,nwl-pcie-2.11";
status = "disabled";
#address-cells = <3>;
#size-cells = <2>;
#interrupt-cells = <1>;
msi-controller;
device_type = "pci";
interrupt-parent = <&gic>;
interrupts = <0 118 4>,
     <0 117 4>,
     <0 116 4>,
     <0 115 4>, /* MSI_1 [63...32] */
     <0 114 4>; /* MSI_0 [31...0] */
interrupt-names = "misc", "dummy", "intx",
  "msi1", "msi0";
msi-parent = <&pcie>;
reg = <0x0 0xfd0e0000 0x0 0x1000>,
      <0x0 0xfd480000 0x0 0x1000>,
      <0x80 0x00000000 0x0 0x1000000>;
reg-names = "breg", "pcireg", "cfg";
ranges = <0x02000000 0x00000000 0xe0000000 0x00000000 0xe0000000 0x00000000 0x10000000 /* non-prefetchable memory */
  0x43000000 0x00000006 0x00000000 0x00000006 0x00000000 0x00000002 0x00000000>;/* prefetchable memory */
interrupt-map-mask = <0x0 0x0 0x0 0x7>;
bus-range = <0x00 0xff>;
interrupt-map = <0x0 0x0 0x0 0x1 &pcie_intc 0x1>,
<0x0 0x0 0x0 0x2 &pcie_intc 0x2>,
<0x0 0x0 0x0 0x3 &pcie_intc 0x3>,
<0x0 0x0 0x0 0x4 &pcie_intc 0x4>;
iommus = <&smmu 0x4d0>;
power-domains = <&zynqmp_firmware PD_PCIE>;
pcie_intc: legacy-interrupt-controller {
interrupt-controller;
#address-cells = <0>;
#interrupt-cells = <1>;
};



M Kirberg

Hi,

very good question.

Are you sure the XSA is correct? You could also check the contents of the XSA to see whether everything there turned out as expected.

br

pema

Hi,
yes, I also checked the contents of the .xsa.
In zusys.hwh I can see that PSU__PCIE__PERIPHERAL is enabled and configured as endpoint.
So the issue arises in PetaLinux. Would it be possible for you to try it on your side?


...
        <PARAMETER NAME="PSU__PCIE__PERIPHERAL__ENABLE" VALUE="1"/>
        <PARAMETER NAME="PSU__PCIE__PERIPHERAL__ENDPOINT_ENABLE" VALUE="1"/>
        <PARAMETER NAME="PSU__PCIE__PERIPHERAL__ROOTPORT_ENABLE" VALUE="0"/>
        <PARAMETER NAME="PSU__PCIE__PERIPHERAL__ENDPOINT_IO" VALUE="MIO 29"/>
        <PARAMETER NAME="PSU__PCIE__PERIPHERAL__ROOTPORT_IO" VALUE="&lt;Select>"/>
        <PARAMETER NAME="PSU__PCIE__LANE0__ENABLE" VALUE="1"/>
        <PARAMETER NAME="PSU__PCIE__LANE0__IO" VALUE="GT Lane0"/>
        <PARAMETER NAME="PSU__PCIE__LANE1__ENABLE" VALUE="0"/>
        <PARAMETER NAME="PSU__PCIE__LANE1__IO" VALUE="&lt;Select>"/>
        <PARAMETER NAME="PSU__PCIE__LANE2__ENABLE" VALUE="0"/>
        <PARAMETER NAME="PSU__PCIE__LANE2__IO" VALUE="&lt;Select>"/>
        <PARAMETER NAME="PSU__PCIE__LANE3__ENABLE" VALUE="0"/>
        <PARAMETER NAME="PSU__PCIE__LANE3__IO" VALUE="&lt;Select>"/>
        <PARAMETER NAME="PSU__PCIE__RESET__POLARITY" VALUE="Active Low"/>
...

Many thanks

M Kirberg

Hi,

same result on my side; it is definitely the devicetree that is not being configured automatically.
Was it generated correctly in 2021.2?

If so, please open a case with AMD Support.

Best

pema

Hi,
yes, same issue with 2021.2. It just doesn't appear in the pcw.dtsi.
Have you tried it as endpoint? If so, could you please share your setup/steps?
Regards,
Pema

M Kirberg

Hi,

our last Reference Design for PCIe is from 2019.2. Apparently the devicetree used to be autogenerated there.

I could not yet find a reason why this is no longer autogenerated, but I found something that I gave to a customer, who used it in 2021.2 for a similar configuration.
I think you should just add it manually for now:



#include "include/dt-bindings/phy/phy.h"

&amba {
      refclk2:psgtr_pcie_clock {
          compatible = "fixed-clock";
          #clock-cells = <0x0>;
          clock-frequency = <100000000>;
      };
};

&psgtr {
       status = "okay";
       #clock-cells = <0x01>;
       clocks = <&refclk2>;
       clock-names = "ref2";
};

&pcie {
      status = "okay";
      phy-names="pciephy";
      phys = <&psgtr 0x0 PHY_TYPE_PCIE 0x0 0x0>;

};


(This is for Lane0 with clock on Input2)


pema

Hi,
thanks again for your fast reply.
You mean add it to <my_pt_lnx_project>/project-spec/meta-user/recipes-bsp/device-tree/files/system-user.dtsi?

Followed by this?
petalinux-build -c device-tree -x cleanall
petalinux-build -c device-tree

thanks

M Kirberg


pema

Many thanks!
Any ideas on how to check whether it worked? For example:
fdtdump images/linux/system.dtb |grep pcie

I use reference clock 3 and one lane (or at least the TE0820 does). Would this be okay?

#include "include/dt-bindings/phy/phy.h"

&amba {
      refclk3:psgtr_pcie_clock {
          compatible = "fixed-clock";
          #clock-cells = <0x0>;
          clock-frequency = <100000000>;
      };
};

&psgtr {
       status = "okay";
       #clock-cells = <0x01>;
       clocks = <&refclk3>;
       clock-names = "ref3";
};

&pcie {
      status = "okay";
      phy-names="pciephy";
      phys = <&psgtr 0x0 PHY_TYPE_PCIE 0x0 0x0>;

};


This already showed the pcie device before.

M Kirberg

Yes looks good.
Just grep for "ref3" or see if pcie is enabled now, status was disabled before.

pema

Thanks. I am not quite there yet, but getting closer.
With
$ dtc -I dtb -O dts images/linux/system.dtb
I now get references to refclk3 and pcie :
pcie = "/axi/pcie@fd0e0000";
pcie_intc = "/axi/pcie@fd0e0000/legacy-interrupt-controller";
...
refclk3 = "/axi/psgtr_pcie_clock";


Everything gets compiled and the image is packed without any issues.
But then, once the kernel starts loading, I get a kernel panic :(

[    4.448320] nwl-pcie fd0e0000.pcie: host bridge /axi/pcie@fd0e0000 ranges:
[    4.455218] nwl-pcie fd0e0000.pcie:      MEM 0x00e0000000..0x00efffffff -> 0x00e0000000
[    4.463223] nwl-pcie fd0e0000.pcie:      MEM 0x0600000000..0x07ffffffff -> 0x0600000000
[    4.471305] SError Interrupt on CPU1, code 0xbf000002 -- SError
[    4.471312] CPU: 1 PID: 79 Comm: kworker/u4:1 Not tainted 5.15.36-xilinx-v2022.2 #1
[    4.471319] Hardware name: xlnx,zynqmp (DT)
[    4.471323] Workqueue: events_unbound deferred_probe_work_func
[    4.471339] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    4.471345] pc : nwl_pcie_link_up.isra.0+0xc/0x2c
[    4.471355] lr : nwl_pcie_probe+0x358/0x960
[    4.471361] sp : ffff800009713b40
[    4.471364] x29: ffff800009713b40 x28: ffff000001960b00 x27: 0000000000000000
[    4.471373] x26: 00000000000186a0 x25: 0000000000015f90 x24: 000000000000000a
[    4.471381] x23: ffff000006cd9800 x22: ffff000001a97410 x21: ffff000001a97410
[    4.471390] x20: ffff000006cd9b80 x19: ffff000001a97410 x18: 0000000000000000
[    4.471398] x17: 30203e2d20666666 x16: 6666666666373078 x15: 0000000000000001
[    4.471406] x14: 0000000000000000 x13: 0000000000000018 x12: 0000000000000040
[    4.471413] x11: 0000000000000005 x10: 0101010101010101 x9 : 0000000000000000
[    4.471421] x8 : 7f7f7f7f7f7f7f7f x7 : fefefeff646c606d x6 : 070f1900f2e3efee
[    4.471430] x5 : ffff8000094df558 x4 : 0000000000000000 x3 : 0000000000000002
[    4.471437] x2 : ffff8000095e5234 x1 : ffff80000b000018 x0 : 0000000000000002
[    4.471447] Kernel panic - not syncing: Asynchronous SError Interrupt
[    4.471451] CPU: 1 PID: 79 Comm: kworker/u4:1 Not tainted 5.15.36-xilinx-v2022.2 #1
[    4.471457] Hardware name: xlnx,zynqmp (DT)
[    4.471459] Workqueue: events_unbound deferred_probe_work_func
[    4.471467] Call trace:
[    4.471469]  dump_backtrace+0x0/0x190
[    4.471479]  show_stack+0x18/0x30
[    4.471486]  dump_stack_lvl+0x7c/0xa0
[    4.471493]  dump_stack+0x18/0x34
[    4.471499]  panic+0x14c/0x30c
[    4.471506]  add_taint+0x0/0xb0
[    4.471512]  arm64_serror_panic+0x6c/0x7c
[    4.471517]  do_serror+0x28/0x60
[    4.471522]  el1h_64_error_handler+0x30/0x50
[    4.471530]  el1h_64_error+0x78/0x7c
[    4.471535]  nwl_pcie_link_up.isra.0+0xc/0x2c
[    4.471541]  platform_probe+0x68/0xe0
[    4.471547]  really_probe.part.0+0x9c/0x30c
[    4.471554]  __driver_probe_device+0x98/0x144
[    4.471561]  driver_probe_device+0x44/0x11c
[    4.471569]  __device_attach_driver+0xb4/0x120
[    4.471576]  bus_for_each_drv+0x78/0xd0
[    4.471583]  __device_attach+0xdc/0x184
[    4.471590]  device_initial_probe+0x14/0x20
[    4.471597]  bus_probe_device+0x9c/0xa4
[    4.471604]  deferred_probe_work_func+0x88/0xc0
[    4.471611]  process_one_work+0x1d8/0x390
[    4.471618]  worker_thread+0x298/0x4e0
[    4.471623]  kthread+0x120/0x130
[    4.471631]  ret_from_fork+0x10/0x20
[    4.471638] SMP: stopping secondary CPUs
[    4.471644] Kernel Offset: disabled
[    4.471645] CPU features: 0x00002001,00000842
[    4.471649] Memory Limit: none
[    4.721466] ---[ end Kernel panic - not syncing: Asynchronous SError Interrupt ]---


Some interrupt source is now causing this. It looks like pcie at least shows up during boot now; on the other hand, it causes a kernel panic.
Any thoughts?
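
The two MEM windows printed at the start of the log can be reproduced from the ranges property of the pcie node in zynqmp.dtsi. The following is a minimal sketch in C, assuming the usual PCI bus binding layout of seven cells per entry (3 cells for the PCI address, 2 for the parent address, 2 for the size). The names pci_range_t and decode_pci_range are hypothetical and not part of any kernel API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for one decoded "ranges" entry. */
typedef struct {
    unsigned space;       /* 0 = config, 1 = I/O, 2 = 32-bit mem, 3 = 64-bit mem */
    bool     prefetchable;
    uint64_t pci_addr;    /* address on the PCI bus */
    uint64_t cpu_addr;    /* address on the parent (CPU) bus */
    uint64_t size;
    uint64_t cpu_end;     /* last CPU address covered by the window */
} pci_range_t;

/* Decode one entry: <flags pci_hi pci_lo cpu_hi cpu_lo size_hi size_lo>.
 * Assumes the parent bus uses 2 address cells, as the reg property suggests. */
static pci_range_t decode_pci_range(const uint32_t c[7])
{
    pci_range_t r;

    r.space        = (c[0] >> 24) & 0x3u;          /* space code bits */
    r.prefetchable = ((c[0] >> 30) & 0x1u) != 0u;  /* prefetchable bit */
    r.pci_addr     = ((uint64_t)c[1] << 32) | c[2];
    r.cpu_addr     = ((uint64_t)c[3] << 32) | c[4];
    r.size         = ((uint64_t)c[5] << 32) | c[6];
    r.cpu_end      = r.cpu_addr + r.size - 1u;
    return r;
}
```

Decoding the two entries from zynqmp.dtsi gives 0xe0000000..0xefffffff (32-bit, non-prefetchable) and 0x600000000..0x7ffffffff (64-bit, prefetchable), which matches the log. This suggests the address windows are parsed as intended and the fault happens later in the probe.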





M Kirberg

Thoughts:

Reference Clock: does PLL lock correctly after programming Si? (you could check e.g. with IBERT Design)

Reset Logic: PCIE_PERST, who pulls it and when?


pema

Reference Clock: I have no experience with IBERT yet, so this might take some time. Perhaps in the meantime I will just try to measure the clock output from the SI5338A* (although that might be tricky since there are no test pads in between).

Reset Logic: at the moment it is just "hanging"; no root device is connected. I don't understand how this can be an issue, though, since the reset input has an active pull-up.


M Kirberg

Reset is forwarded via the CPLD to MIO33 on the FPGA.

I think that is definitely wrong in your PS setup.
Might it be pulled down externally at the moment?

pema

Sorry, but now you lost me.
You mean the CPLD on the TE0820? The U21? I don't see any MIO33 attached to it.
Should I use another reset input source? If so, which would be the best?

Or perhaps you are referring to the TEF1002 carrier board CPLD (is there a CPLD?)?
Either way, I am currently using the TE760 carrier board (with PCIe lane0 soldered to the B2B connector). I am waiting for the TEF1002 to be delivered. In the meantime I would just like to test the PCIe as endpoint.



M Kirberg

I mean the CPLD on the TEF1002, yes. MIO33 would only be valid for the TEF1002.

You did not mention anywhere that you are not using an actual TEF1002...

I have no idea which signals you connected, but I now think your error is somewhere here and can still be related to the reset line.
My mail log history shows a crash similar to yours, which was resolved once PLL and reset were correct...




pema

Yes, I forgot to mention that "small" detail. I have only the PCIe lane0 and the RST pin connected. Clock mode is "Separate Clock architecture". The hardware modification was quite simple and straightforward.

Like I said, I would like to start this whole PCIe endeavor on the Xilinx MPSoC with the TEF1002, but I will probably have to wait 2 to 3 weeks until I get the carrier board.
In the meantime I would like to configure my build image setup with what I have. I will configure my PCIe reset pin on MIO33 as well.
Would it be possible for you to provide the TEF1002 petalinux folders:
<petalinux_project>components/plnx_workspace/device-tree/
and
<petalinux_project>/project-spec/meta-user/recipes-bsp/device-tree/files/

Or the .XSA generated from vivado?

I could perhaps compare the TEF1002 DT with mine.







M Kirberg

Hi,

old Reference Design for 2019.2 can be found here:
https://shop.trenz-electronic.de/de/Download/?path=Trenz_Electronic/Modules_and_Module_Carriers/4x5/TE0820/Reference_Design/2019.2/TD_TEF1002
This should have correct settings for TEF1002, so you can compare

The devicetree you have right now should be correct.

pema

Hi,
yes, the device tree looks correct. But why is there an interrupt causing a kernel panic?
The TEF1002 reference design config looks exactly like mine (for the PCIe). I currently use MIO33 as PERST as well.
I also double-checked the Si5338 setup in the FSBL by debugging it under Vitis. It looks correct as well. Still, I will measure the clock to make sure that I have the 100 MHz.

I am still banging my head on this one :(. If you have any ideas please don't hesitate to share. ;)

M Kirberg

Error happens here, when trying to access pcireg register.

https://github.com/Xilinx/linux-xlnx/blob/xilinx-v2022.2/drivers/pci/controller/pcie-xilinx-nwl.c#L189

I suppose this means this is not accessible.

As suggested, this could be clock or reset related according to my mail backlog...
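
For orientation, the check that faults is a single register read from the pcireg block. The sketch below mirrors it as a pure C function, so that a value read out by other means (for example from the FSBL or over JTAG) can be interpreted. The offset 0x238 and the two bit positions are quoted from memory of the linked driver (PS_LINKUP_OFFSET, PCIE_PHY_LINKUP_BIT, PHY_RDY_LINKUP_BIT) and are assumptions to be verified against that file; the pcireg base 0xfd480000 is taken from the reg property in zynqmp.dtsi. The names ps_linkup_t, ps_linkup_addr and decode_ps_linkup are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* "pcireg" base from the reg property of the pcie node in zynqmp.dtsi. */
#define PCIREG_BASE          0xFD480000u
/* Assumed values, recalled from pcie-xilinx-nwl.c; verify before relying on them. */
#define PS_LINKUP_OFFSET     0x00000238u
#define PCIE_PHY_LINKUP_BIT  (1u << 0)
#define PHY_RDY_LINKUP_BIT   (1u << 1)

/* Hypothetical decoded view of the link status register. */
typedef struct {
    bool phy_link_up;  /* PCIe PHY reports link up */
    bool phy_ready;    /* PHY reports ready */
} ps_linkup_t;

/* Physical address that the driver's link-up check would read. */
static uint32_t ps_linkup_addr(void)
{
    return PCIREG_BASE + PS_LINKUP_OFFSET;
}

/* Interpret a raw register value without touching hardware. */
static ps_linkup_t decode_ps_linkup(uint32_t reg_val)
{
    ps_linkup_t s;

    s.phy_link_up = (reg_val & PCIE_PHY_LINKUP_BIT) != 0u;
    s.phy_ready   = (reg_val & PHY_RDY_LINKUP_BIT) != 0u;
    return s;
}
```

If the read itself aborts instead of returning a value, the register block is not responding at all, which would fit the clock or reset theory.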

pema

Hi M Kirberg,
yes, that looks like the issue. I have been digging a little deeper. The clock generated by the SI5338 (the 100 MHz on output 1) is definitely there, so this is properly set in the FSBL (TE modified).
What I also noticed is that when I use the Xilinx FSBL, all the MIOs and PCIe peripherals are set correctly and there is no kernel panic. But then the SI5338 is not configured, since this is not covered by the Xilinx FSBL (I assume they don't have it in their dev kits).
Either way, this made me realize that the error can only come from here, since Xilinx uses psu_init(void) and TE a modified version (TE_XFsbl_TPSU_MODIFIED under te_cfsbl_hooks_te0820.c).

I also looked into TE_XFsbl_TPSU_MODIFIED; there seem to be only references to Eth and USB reset...

#define USE_TE_PSU_FOR_SI_INIT //enable TE PSU to write SI on the correct place in the FSBL (Xilinx default PSU is deactivated)

Have you implemented a version where you set up the Si clock generator and use the default PSU_init?
Thanks

M Kirberg

The TE FSBL modifications only do additional things that Xilinx does not do (pulling ETH RST and configuring the clock chips). They should have no other side effects.

However, the custom Xilinx hooks did not seem to be enough for some tasks, so we extracted some functions.
Could this be broken now with the 2022.2 version you are using?

It may also very well be that, with no clock present, the errors are simply concealed in Linux...

pema

Hi,
I am currently using 2023.1 for Vivado, Vitis and PetaLinux (I also tried other versions, with the same issue). The modifications by Xilinx to the FSBL are minor compared to the version used by you (2019, I think), not counting the hooks implemented by Trenz.

Either way, I am still fighting with this issue. Like I said, the clock (100 MHz) is there. The reset pin is pulled up, but when I try to read it, it always reads low.
I added this under te_xfsbl_hooks_te0820.c >> TE_XFsbl_BoardInit_Custom(void):


// GPIO_DATA_1_RO: same as DATA_0_RO, except that it reflects bank 1, which corresponds to MIO[51:26].

  RegVal = XFsbl_In32(GPIO_DATA_1_RO); // ((GPIO_BASEADDR) + 0x00000064U)
  // MIO29 is bit 3 of bank 1 (29 - 26), assuming GPIO_MIO29_MASK selects that bit
  temp = ((RegVal) & (GPIO_MIO29_MASK)) >> 3;
  if (temp != 0x1) {
    xil_printf("PCIe is held in reset. (GPIO_DATA_1_RO, Val:%x)\r\n", RegVal);
  }
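
To sanity-check which bank and bit a given MIO pin maps to, here is a minimal sketch in C. It assumes the bank layout quoted in the comment above (bank 1 covers MIO[51:26]), with bank 0 covering MIO[25:0] and bank 2 covering MIO[77:52], and DATA_n_RO located at 0x60 + 4*n from GPIO_BASEADDR, consistent with the 0x64 offset used for bank 1. The names mio_loc_t and mio_locate are hypothetical helpers, not part of the FSBL.

```c
#include <stdint.h>

/* Hypothetical helper type: where a MIO pin lives in the PS GPIO block. */
typedef struct {
    unsigned bank;         /* GPIO bank index (0..2) */
    unsigned bit;          /* bit position inside that bank's DATA_n_RO */
    uint32_t mask;         /* 1u << bit */
    uint32_t data_ro_off;  /* offset of DATA_n_RO from GPIO_BASEADDR */
} mio_loc_t;

/* Returns 0 and fills *out on success, -1 if the number is not a MIO pin. */
static int mio_locate(unsigned mio, mio_loc_t *out)
{
    /* Assumed layout: bank 0 = MIO[25:0], bank 1 = MIO[51:26], bank 2 = MIO[77:52]. */
    static const unsigned first_pin[3] = { 0u, 26u, 52u };
    static const unsigned last_pin[3]  = { 25u, 51u, 77u };

    for (unsigned bank = 0; bank < 3; bank++) {
        if (mio >= first_pin[bank] && mio <= last_pin[bank]) {
            out->bank = bank;
            out->bit = mio - first_pin[bank];
            out->mask = (uint32_t)1u << out->bit;
            out->data_ro_off = 0x60u + 4u * bank;
            return 0;
        }
    }
    return -1;
}
```

Under these assumptions MIO29 resolves to bank 1, bit 3, which agrees with the shift by 3 in the snippet, while MIO33 would be bank 1, bit 7, so a check on MIO33 would need a different mask and shift.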


It starts to feel like I am looking for a needle in a haystack.  :-[
I guess I could read "all" the registers of the PCIe module before starting Linux: https://www.xilinx.com/htmldocs/registers/ug1087/ug1087-zynq-ultrascale-registers.html#mod___axipcie_main.html
PCIE_ATTRIB might be the best to start with? Or PCIE_STATUS, EP_CTRL?

Am I even looking in the right direction? Is this an issue caused by a misconfiguration in the FSBL, or is it still the device tree?
Thanks once again. Have a nice weekend!

pema

Hi,
I disabled the NWL driver. I don't know why I had it enabled, since the controller should be configured as EP.
As far as I understand, the NWL driver is for root ports.
This change eliminates the kernel crash.

pema

Hi there again,
well, I am now back to the PS-PCIe EP issue. As mentioned, I had disabled the NWL bridge controller driver (since it was said to work only in Root Complex mode).
The problem is that if I disable the NWL PCIe core driver, the device is no longer found (and therefore no probe takes place). If I enable it and try to write to the BARs, I get:

nwl-pcie fd0e0000.pcie: Unsupported request Detected


In bare metal I was able to run a basic example and get the PCIe EP to work, based on https://github.com/Xilinx/embeddedsw/blob/master/XilinxProcessorIPLib/drivers/pciepsu/examples/xpciepsu_ep_enable_example.c

I realize that this is perhaps a question that should be directed to AMD/Xilinx rather than Trenz, but perhaps you have already used the TE0820 + TEF1002 as EP with Linux. Or perhaps you already have a demo for the TEF1002 carrier board?
I would appreciate any help.
Best




pema

Well, this doesn't seem to be a widely covered subject.
Perhaps that is because EP mode on this SoC is used less than RC mode. Another possible reason is that the PCI Express CEM Specification defines a 100 ms rule from the de-assertion of PERST# (slot reset) to the time at which a PCI Express root complex (host) is allowed to probe the connected downstream endpoint.
At present Xilinx/AMD does not provide a Linux device driver for the PCIe controller in EP mode, only for Host/RC mode.
The existing driver could perhaps be extended to support EP mode as well and made usable within the PCI EP framework.

https://github.com/Xilinx/linux-xlnx/blob/master/drivers/pci/controller/pcie-xilinx-nwl.c