PL clock failure

Started by charlie5902, October 17, 2018, 05:36:50 PM

charlie5902

Hi,

So I have been working with TEBF0808-04A carrier and TE0803 module for about a month now and having great success developing my Vivado project for our application.
Yesterday, I ran into a situation where it appears the PL clock is no longer functional.
I don't know if it is a coincidence, but the failure occurred at the time when I lowered the JTAG Clock frequency to accommodate an ILA being clocked at a low frequency for debug.
I am now no longer able to communicate with PL components over the AXI bus, software crashes with any PL AXI slave access and my ILAs are not showing up as present after initializing the psu and a hardware refresh. I have rebuilt the project and tried other projects that used to work OK, power cycled the board, my laptop, etc. Problem still persists.
All of this is consistent with PL clock not running.

Is there any way that lowering the JTAG clock frequency could have caused this kind of issue?

Any thoughts on potential recovery methods I could try?

Thanks for any help.

Charlie

JH

Hi Charlie,

Normally the JTAG debug interface should not have such an effect on the PS-PL CLK; very strange.

We know of an interaction between the JTAG debugger interface, Linux and the UART since Vivado 2017.4, but that was solved by disabling frequency scaling on Linux:
But in that case the problem was not permanent.

Can you please try the prebuilt files from the reference design once? Docu + Download:

The reference design uses a VIO instead of an ILA.
Can you send me the result of this test?

br
John

charlie5902

Hi John,

Thanks for the reply and suggestion!!

So that did the trick - those binaries work. So something is wrong with project/environment I guess.

I tried to build the starter kit reference - the tcl script still fails during block design creation/connection. But I am able to fix it up so it builds and the VIO shows up in hardware manager. I can add an ILA and it shows up too. So it means my hardware is not damaged/failed.

However, when I switched to loading my bitstream - the fan died during configure and my ILAs do not show up and I can't access AXI slaves in PL from ARM. Again, this all had been working for me for a long time, 6 weeks or so.

My project does not have the RGPIO or CPLD interfaces in it - can those inputs to the carrier being left floating have anything to do with it do you think?



JH

Hi,
select an even ID in design_basic_settings.cmd. We provide 2 different board part files: one is for the TEBF0808 and one has only a minimal configuration.
See also:
RGPIO is only a serial communication interface between the CPLDs and the FPGA. It's optional.
It's very strange with your PL CLKs. What's running on the PS side in your design?

br
John

charlie5902

Hi John,

So some new info here:

1. I was using part #7, switched to #8 and the Starter Kit project builds to completion. The VIO is not present in hardware manager after I run a hello world SDK app to get the psu system initialized and PL clocks running. And yes I always make sure to do a hardware refresh after starting the app. Same case if I use the prebuilt binaries version or the no-prebuilt binaries.

2. I tried using the test board project as well, with a gpio and ila added - same results. Software crashed when trying to access the Axi GPIO IP core and ILA does not show up. Both are signs of missing PL clock.  Side note: the test board project has a small error in project creation, validate design fails due to unconnected live audio clocks. I just disable live audio in PS config and generate a wrapper and build.

3. The configuration that kind of worked for me was when I used Starter Kit with part #7, project creation fails about halfway through, I finish the design by making missing connection, stubbing some out. I was able to access my ILA and GPIO originally but when I tried to re-create this, it does not work a second time.

4. I have tried many experiments on both projects to see if I get a clock back - to no avail.

5. I tried re-installing Vivado and reproducing those experiments. Nothing seems to work.

6. In all cases, when I open the implemented design and look at the schematic, I see a dbg_hub and the PS-PL clock connected to it.

7. I installed Vivado 2018.2 on a Windows laptop and tried the test board project. Same results - no dbg_hub or ILA.

8. On the PS side, I have an SDK-generated hello world app. The only thing I add to it is an AXI GPIO access to test PL AXI slave operation.

9. I have tried running the generated psu_init.tcl without software running, then refresh hardware. still nothing.

It sure seems like my hardware is defective. What do you think?

Charlie

JH

Hi Charlie,
at first:
The Starterkit reference design uses even numbers, where the board part file names end in *_tebf0808: https://wiki.trenz-electronic.de/display/PD/TE0803+StarterKit#TE0803StarterKit-DesignFlow
The test board reference design uses odd numbers, where the board part file names do not end in *_tebf0808: https://wiki.trenz-electronic.de/display/PD/TE0803+Test+Board#TE0803TestBoard-DesignFlow
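
The even/odd rule can be written down as a small check, which may help when switching between the two reference designs. This is only a sketch: the function name and the returned labels are hypothetical, and it assumes board part IDs are positive integers. Only the parity rule itself comes from the thread.

```python
def reference_design_for_id(board_part_id: int) -> str:
    """Map a board part ID from design_basic_settings.cmd to the matching design.

    Hypothetical helper; only the even/odd rule is taken from the thread.
    """
    if board_part_id <= 0:
        # Assumption: IDs are positive integers.
        raise ValueError("board part IDs are assumed to be positive integers")
    # Even IDs: board part files ending in *_tebf0808 (Starterkit design).
    # Odd IDs: module-only board part files (test board design).
    return "StarterKit" if board_part_id % 2 == 0 else "Test Board"


for part_id in (7, 8):
    print(part_id, "->", reference_design_for_id(part_id))
```

With the IDs mentioned in this thread, 7 belongs to the test board design and 8 to the Starterkit design.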

At the moment, 2 different types of board part files are included for every assembly variant in both projects. One provides the ZynqMP configuration for the main TEBF0808 periphery and the other only for the module periphery (DDR, QSPI and UART). UART normally also depends on the carrier connection, but without UART there is no output on the console, so the same UART configuration as on the TEBF0808 is used.
As for your module problem with the internal CLK: it sounds like something is broken inside the SoC. Can you write to support@trenz-electronic.de so we can check this?

br
John

charlie5902

Hi John,

Thanks for your helpful replies!

So with your help, I am getting some traction now. The test board project with part number 7 is now working properly for me. It builds to completion without intervention, and software is able to access the AXI slave IP and my ILA shows up in hardware manager.

Starter Kit project is a different story. I have correctly set the part number to 8 for Starter Kit and rebuilt. Block design creation succeeds all the way to completion. However, the Run Block Automation prompt comes up and never goes away when I run it. I tried a couple of times with no change. I go ahead and build anyway, but I am concerned a necessary setting is not getting applied. So I build and configure the board, launch SDK and try to launch the application using the debugger (same as with test board), and I get errors that prevent launching the debugger. The errors are coming from psu_init.tcl: mask poll timeouts on 0xFD4063E4 and 0xFD4023E4. If I remove those mask polls from psu_init.tcl, I can launch and run my app, and ILA and GPIO show up fine.

Any thoughts on the Mask poll timeout errors?

Thanks,
Charlie

JH

Hi,

The Starterkit design has a modified FSBL for SI5338 PLL initialisation --> needed for the reference CLKs of the GTP for PCIe, USB3, DP, SATA. The Xilinx init script does not include these changes. Without the CLK, you get this error message:
--> search for "0xFD4063E4"

So disable the Xilinx init script (like you have done) or configure the SI5338 before you start (start the system with the Starterkit Boot.bin, change the Boot Mode to JTAG and press reset (not power off!)).
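
To illustrate why a missing reference clock shows up as a mask poll timeout, here is a minimal Python sketch of what such a poll does: it reads a status register repeatedly until a masked bit is set, and gives up after a fixed number of tries. The register layout constants (SERDES base, per-lane stride, status offset) and the lock bit are assumptions that are not stated in this thread and should be checked against the register reference. The read functions are stand-ins for real JTAG register reads.

```python
from typing import Callable

# Assumed register layout (not stated in the thread; check the register reference).
SERDES_BASE = 0xFD400000    # assumed start of the PS SERDES register block
LANE_STRIDE = 0x4000        # assumed spacing between per-lane register banks
PLL_STATUS_OFFSET = 0x23E4  # assumed lane PLL status register inside each bank
PLL_LOCK_MASK = 0x10        # hypothetical "PLL locked" bit, used only for this demo
NUM_LANES = 4


def serdes_lane(addr: int) -> int:
    """Return the lane index of a lane PLL status address under the assumed layout."""
    lane, remainder = divmod(addr - SERDES_BASE - PLL_STATUS_OFFSET, LANE_STRIDE)
    if remainder != 0 or not 0 <= lane < NUM_LANES:
        raise ValueError(f"{addr:#x} is not a lane PLL status address in this layout")
    return lane


def mask_poll(read_reg: Callable[[int], int], addr: int, mask: int,
              max_tries: int = 1000) -> bool:
    """Poll addr until any bit in mask is set; return False if max_tries is reached."""
    for _ in range(max_tries):
        if read_reg(addr) & mask:
            return True
    return False


def read_without_refclk(addr: int) -> int:
    """Stand-in register read: the lock bit never sets without a reference clock."""
    return 0x00


def read_with_refclk(addr: int) -> int:
    """Stand-in register read: the PLL reports lock."""
    return PLL_LOCK_MASK


for addr in (0xFD4023E4, 0xFD4063E4):
    ok = mask_poll(read_without_refclk, addr, PLL_LOCK_MASK)
    print(f"{addr:#x} (lane {serdes_lane(addr)}):", "locked" if ok else "mask poll timeout")
```

Under these assumptions the two failing addresses map to two different transceiver lanes, which fits the explanation that the lane PLLs cannot lock while the SI5338 is not configured.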

So on the test board the ILA works? Check the selected PS-PL clock PLL in the Vivado ZynqMP IP. I think it's a different one. Maybe one of these internal PLLs has a problem. In case you do not use DP, you can also change this in the Starterkit design (disable DP and regenerate Linux). The problem with DP is that Linux changes the frequency on DP initialisation, so these internal PLLs are stopped for reconfiguration (--> Vivado adds warnings if you use the DP PLL for other interfaces as well).

br
John


JH

I forgot:
ignore this "Run Block Automation" prompt on this version after the IP is configured with our board part files. I will set the property to suppress this message in the next design update.
Unfortunately Xilinx has no documentation for IP properties, so in most cases we must find out these additional configurations by ourselves.
br
John 

charlie5902

Hi John,

Thanks again for helpful replies.

I am still not out of the woods.

The Starter Kit projects that worked at end of day yesterday do not work this morning. Same bitstream, same software app, but no ILA or AXI slave access. This is on 2 identically configured boards. My test board projects still work.

This is starting to sound a lot like flaky PLL operation in the Starter Kit configuration.

Couple things to run by you:

1. I am not running Linux, it is Xilinx SDK bare metal with a generated Hello World app -- that is it, except I add a couple of lines of GPIO support to test AXI slave access.

2. Software operation is JTAG only -- no FSBL is in the picture at all. I load the bitstream via Hardware Manager and launch the Hello World app via the debugger and the psu_init.tcl configure script.

3. I am using a 12V power supply to power the carrier, connected to J25 on the TEBF0808-04A carrier. Could it be possible that I would get more stability with a supply connected to J20?

4. Jtag is connected to XMOD1.

5. S5 switches are set for JTAG boot OFF-OFF-OFF-OFF.

Any thoughts on any of this?



JH

Hi Charlie,

Quote
This is starting to sound a lot like flaky PLL operation in the Starter Kit configuration.
PLL setup always depends on your design. If you use the PS-GTPs you need valid GTP reference CLKs; one way to get them is to configure the SI5338 via the FSBL. Xilinx also uses the FSBL for their evaluation board initialisation; see for example the default FSBL source code (xfsbl_board.c), which contains defines for the evaluation boards. Xilinx has the advantage of including their board-specific configuration directly in the default FSBL, so Xilinx evaluation boards are much easier to handle.

ZynqMP internal PLL selection always depends on your selected interfaces, especially for Display Port. If you use DP, you need one of the available internal PLLs only for DP_VIDEO (VPLL) and one internal PLL only for DP_AUDIO (RPLL). This selection must be done manually; the automatic selection by the IP is not always correct at the moment. DPLL is mainly used for DDR; if you use this PLL for other periphery, you must always check that the DDR CLK is still calculated correctly after changing the selection. And in the end there are not so many PLLs available in the Starterkit design... The test board design has a minimal configuration: only one PLL for DDR4 is fixed and no other interfaces are activated.
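
The PLL rules in the paragraph above can be expressed as a small consistency check over a clock-to-PLL assignment. This is a sketch only: the function, the clock names such as PL0, and the example assignments are hypothetical and are not read from any Vivado project. It encodes just the two rules stated here (VPLL and RPLL are dedicated when they drive DP, and a shared DPLL means the DDR clock must be rechecked).

```python
from typing import Dict, List

# PLLs that should be dedicated to one clock when DP is in use.
DP_RESERVED = {"VPLL": "DP_VIDEO", "RPLL": "DP_AUDIO"}


def pll_conflicts(assignment: Dict[str, str]) -> List[str]:
    """Return warnings for a mapping of clock name -> PLL name."""
    users: Dict[str, List[str]] = {}
    for clock, pll in assignment.items():
        users.setdefault(pll, []).append(clock)

    warnings: List[str] = []
    for pll, owner in DP_RESERVED.items():
        clocks = users.get(pll, [])
        # Only a problem if the DP clock really uses this PLL and shares it.
        if owner in clocks and len(clocks) > 1:
            others = sorted(c for c in clocks if c != owner)
            warnings.append(f"{pll} drives {owner} and also {', '.join(others)}")

    dpll_clocks = users.get("DPLL", [])
    sharers = sorted(c for c in dpll_clocks if c != "DDR")
    if "DDR" in dpll_clocks and sharers:
        warnings.append(f"DPLL is shared with {', '.join(sharers)}; recheck the DDR clock")
    return warnings


# Hypothetical assignments for illustration only.
with_dp = {"DP_VIDEO": "VPLL", "DP_AUDIO": "RPLL", "DDR": "DPLL", "PL0": "RPLL"}
without_dp = {"DDR": "DPLL", "TOPSW_MAIN": "VPLL", "IOU_SWITCH": "RPLL", "PL0": "RPLL"}

print(pll_conflicts(with_dp))
print(pll_conflicts(without_dp))
```

In the first example the PL clock shares the PLL reserved for DP audio and is flagged. In the second, DP is disabled, so sharing RPLL between IOU_SWITCH and a PL clock raises no warning.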

Quote
the Starter Kit projects that worked at end of day yesterday do not work this morning
and
Quote
My test board projects still work
If you do not use DP, disable DP in the Starterkit IP and use the PLLs as in the test board design (VPLL for TOPSW_MAIN, RPLL for IOU_SWITCH and PL CLKs). This PL CLK selection can be the critical difference between the 2 designs, but until now I have not found any reason why the Starterkit configuration should not work:

In the end I think there is something broken inside your SoC; maybe the effect depends on the load of the used PLLs, it's hard to say.

br
John 



charlie5902

Hi John,

Just to follow through on all this:

1. At this point, the only thing that does not work is the stock Starter Kit project plus AXI GPIO and ILA, using bare metal Xilinx SDK and an SDK-generated Hello World project. The init script generated by the SDK appears to incorrectly configure the PL clock PLLs, which causes the issues I described. This Starter Kit PL clock issue happens on 2 identical Trenz Carrier/SOM dev platforms.

2. My thought is that with Petalinux and FSBL this issue may not exist. We are going to be moving to Petalinux soon. Right now I am doing low level development of our IP in the PL using bare metal SDK to vet the IP and interfaces to ARM.

3. My current Vivado project has a working PLL configuration. It was generated from a Starter Kit as baseline but has Carrier stuff stripped out.

4. As we are evaluating this platform for eventual use in final product, we would like to be confident there are no risk issues with the hardware. To that end, it would be really helpful to have a Starter Kit project that runs correctly with an XSDK simple Hello World project that does not have PL Clock PLL issues. Any thoughts you have on making this happen would be greatly appreciated.

Thanks again for assistance in working through this.

Charlie


JH

Hi,
Quote
1. At this point, the only thing that does not work is the stock Starter Kit project plus AXI GPIO and ILA, using bare metal Xilinx SDK and an SDK-generated Hello World project. The init script generated by the SDK appears to incorrectly configure the PL clock PLLs, which causes the issues I described. This Starter Kit PL clock issue happens on 2 identical Trenz Carrier/SOM dev platforms.
As I wrote before, the Xilinx init scripts do not initialise the SI5338, so no GTP reference CLKs are available! Disable the GTP interfaces (PCIe, SATA, USB3) or initialise the SI5338 with the correct FSBL before you start the debugger.
Quote
2. My thought is that with Petalinux and FSBL this issue may not exist. We are going to be moving to Petalinux soon.
The reference design is also available with Petalinux; you can try out whether the FSBL generated by Petalinux makes a difference, but I think this will not help.
Quote
3. My current Vivado project has a working PLL configuration. It was generated from a Starter Kit as baseline but has Carrier stuff stripped out.
Good, but I think you have a different PS configuration now, correct? So your board can still have this problem, but you are not using this part at the moment.
Quote
4. As we are evaluating this platform for eventual use in final product, we would like to be confident there are no risk issues with the hardware. To that end, it would be really helpful to have a Starter Kit project that runs correctly with an XSDK simple Hello World project that does not have PL Clock PLL issues. Any thoughts you have on making this happen would be greatly appreciated.
See the answer to your first question. Disable the GTP-relevant interfaces and the SDK debugger init script will start. For simple SDK debugging with hello world, use the test_board design; these interfaces are disabled there.

br
John

charlie5902

Okay John, thanks for the reply. I think I understand now about the GTP interfaces. I will give that a try.

Charlie