FX2 USB PacketSize

Started by aobo song, November 05, 2013, 02:02:51 PM

aobo song

Hello all,

I am trying to send data from the FPGA to the host PC via the FX2 USB interface (TE0630) without using MicroBlaze. I am new to the USB protocol, so to get started I followed the help provided on this page: http://forum.trenz-electronic.de/index.php/topic,247.0.html

My aim is to send 8 bytes 100,000 times per second to the PC, so the data rate is not that high, but the FIFO gets full often. I can see this because the FIFO-full flag is wired to one of the LEDs on the board, and it flickers often.

Does this mean the host PC is not reading the data fast enough?


Oleksandr Kiyenko

Hello,
The FX2 chip has internal FIFOs for your packets. Each FIFO is divided into buffers (from 1 to 4, depending on configuration). When you send a packet (any size from 1 byte up to the buffer size), it goes into the current buffer. When you assert PKTEND, the current buffer is closed and the next one is opened. So if you send very small packets very often, you waste buffer space and bandwidth (in your case a 512-byte buffer holds only 8 bytes).
Small packets are not a problem for slow communication such as RS-232, but if you want high bandwidth, the packet size should be close to the buffer size.

Best regards
Oleksandr Kiyenko

Horsa

#2
As explained above by Oleksandr, USB 2.0 is better used with bulk packets of larger size.


1)
Your USB device (Trenz Electronic TE0630 FPGA module) is using Hi-Speed USB => Hi-Speed USB device endpoints may only select a maximum data payload size of 512 bytes => bulk endpoint's wMaxPacketSize is 512.

Quote: Compliant USB 2.0/1.1 drivers must transmit packets of maximum size (wMaxPacketSize) and then either end the transmission by means of a packet of less than maximum size, or delimit the end of the transmission by means of a zero-length packet. The transmission is not complete until the driver sends a packet smaller than wMaxPacketSize. If the transfer size is an exact multiple of the maximum, the driver must send a zero-length delimiting packet to explicitly terminate the transfer.

Delimiting the data transmission with zero-length packets, as required by the USB specification, is the responsibility of the device driver. The system USB stack will not generate these packets automatically.
http://msdn.microsoft.com/en-us/library/windows/hardware/ff537069.aspx

If a user wants to force a bulk packet with a size of less than wMaxPacketSize, the user should use PKTEND.
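
The rule in the quote above can be written down as a few lines of arithmetic. This is only a sketch: the function names are made up for illustration, and the 512-byte constant is the Hi-Speed bulk wMaxPacketSize mentioned in point 1, not something read from the device.

```cpp
#include <cstddef>

// Hi-Speed bulk wMaxPacketSize, as stated in point 1 (assumed, not read from a descriptor).
constexpr std::size_t kMaxPacketSize = 512;

// Number of full-size packets in a transfer of `bytes` bytes.
constexpr std::size_t fullPackets(std::size_t bytes) {
    return bytes / kMaxPacketSize;
}

// Size of the trailing short packet; 0 means the transfer has none.
constexpr std::size_t shortPacketBytes(std::size_t bytes) {
    return bytes % kMaxPacketSize;
}

// True when a non-empty transfer is an exact multiple of wMaxPacketSize,
// so it has no short packet and needs a zero-length packet as terminator.
constexpr bool needsZeroLengthPacket(std::size_t bytes) {
    return bytes > 0 && bytes % kMaxPacketSize == 0;
}
```

With these helpers, a 400-byte transfer is a single short packet and needs no terminator, while a 1024-byte transfer is two full packets and does need a zero-length packet.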


2)
Quote: The USB is a polled bus, meaning the host controller (the computer in this case) must initiate all transfers. Do not mistake this to mean that the system software must poll the USB. The host controller takes care of polling the bus and can be programmed to issue interrupts to the OS whenever the bus needs attention.
http://wiki.osdev.org/Universal_Serial_Bus#Basic_Concepts_and_Nomenclature

There is no way for a USB device (TE0630) to "interrupt" its host controller (the computer) in the same manner as other hardware interrupts. USB does support an Interrupt transfer method, but this is in fact implemented by polling and the latency one can achieve is about 1 ms, but ultimately limited by the host's performance.

bInterval: Interval for polling a device during a data transfer, expressed in units of microframes for high-speed devices, and frames for low- and full-speed devices.


3)
Quote: Distribution of Bus Access Time: Frames and Microframes

To ensure synchronization between the host and the functions, the USB divides bus time into fixed-length segments. For low- or full-speed buses, the USB divides the bus time into 1 millisecond units, called frames. For a high-speed bus, the USB divides the bus time into 125 microsecond units, called microframes.

Note that frames and microframes do not coexist on one bus; low- and full-speed buses used frames, but in developing a high-speed bus, a shorter frame was necessary because the significantly higher signaling bit rate is more sensitive to smaller shifts in synchronization between the host and the function.

Frames and microframes are mostly a physical-layer detail and should not be confused with any of the previous concepts. Frames and microframes do not correspond to any packet or transaction; in fact, several transactions usually take place during one (micro)frame (between 13 and 133 bulk packets fit into one 125 µs microframe, depending on the payload size). The host controller issues a start-of-frame (SOF) packet at the beginning of every (micro)frame. The remainder of the (micro)frame is available for the host controller to carry out transactions. A transaction may not take place if it cannot be completed in the same (micro)frame (because otherwise the next SOF packet would interrupt the transaction).

It is important to realize that the host controller may rearrange transactions to make better use of the available bandwidth. Of course, two transactions through the same pipe must occur in the correct order, but the transactions of two separate transfers may be reordered at the host controller's discretion.
http://wiki.osdev.org/Universal_Serial_Bus#Frames_and_Microframes

The subdivision of a transfer into bulk packets is decided by the operating system's USB stack and the device driver, not by the software API (Trenz Electronic, Cypress or libusb(x)). To force a short bulk packet from the device side, you should use PKTEND.


4)
Another topic is the following:

  • a maximum of 13 bulk packets can be used (with packet size of 512 bytes) in every microframe (125 µs) => 13 / 125 µs => 104,000 packets per second
  • a maximum of 119 bulk packets can be used (with packet size of 8 bytes) in every microframe (125 µs) => 119 / 125 µs => 952,000 packets per second
   
To put the bytes-per-packet numbers into perspective, the maximum theoretical high-speed bandwidth over USB 2.0 (from the specification) is achieved with 13 BULK packets per microframe. This represents a maximum bandwidth of:
13 * 512 bytes / 125 µs = 53.248 MBytes/sec [theoretical max]
(13 packets per microframe, 512 bytes per packet, 125µs per microframe).

If you use PKTEND to force a transaction with only 8 bytes, you can theoretically achieve 119 * 8 bytes / 125 µs = 7.616 MBytes/sec (theoretical max). In practice, you should expect a lower value.
Source: Universal Serial Bus Specification Revision 2.0, Table 5-10. High-speed Bulk Transaction Limits.

  • Data Payload 512 -> Max Transfer 13 -> 53.248 MB/s
  • Data Payload     8 -> Max Transfer 119 -> 7.616 MB/s
http://electronix.ru/forum/index.php?act=Attach&type=post&id=9066
http://www.cypress.com/?rID=12967
http://www.cypress.com/?rID=40037
http://www.usb.org/developers/presentations/pres0501/Shaw_High_Bandwidth_Final.ppt
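
The arithmetic in point 4 can be checked with a few lines of code. This is a rough sketch: the helper names are invented for illustration, and the packets-per-microframe figures are simply the ones quoted above, not values obtained from hardware.

```cpp
#include <cstdint>

// One second holds 8000 microframes of 125 µs each.
constexpr std::uint64_t kMicroframesPerSecond = 8000;

// Packets per second for a given number of bulk packets per microframe.
constexpr std::uint64_t packetsPerSecond(std::uint64_t packetsPerMicroframe) {
    return packetsPerMicroframe * kMicroframesPerSecond;
}

// Theoretical payload bandwidth in bytes per second.
constexpr std::uint64_t bytesPerSecond(std::uint64_t packetsPerMicroframe,
                                       std::uint64_t payloadBytes) {
    return packetsPerSecond(packetsPerMicroframe) * payloadBytes;
}

// Packets per second needed to carry `dataRate` bytes/s with a given payload
// size (rounded up, since a partially filled packet still costs a packet).
constexpr std::uint64_t requiredPacketsPerSecond(std::uint64_t dataRate,
                                                 std::uint64_t payloadBytes) {
    return (dataRate + payloadBytes - 1) / payloadBytes;
}
```

For the stream discussed in this thread (8 bytes, 100,000 times per second, i.e. 800,000 bytes/s), requiredPacketsPerSecond(800000, 8) is 100,000 packets per second, while requiredPacketsPerSecond(800000, 400) is only 2,000 packets per second.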



aobo song

#3
Thanks for the replies Oleksandr and Horsa.

At the moment I do use PKTEND. I send 50*8 bytes at once. The FIFO is configured as 512x2 (3.02 iic).

In my C++ SW I have a loop that reads 512 bytes at a time and writes them into a text file (this takes a while). What I see is that the LED (FIFO full) turns on periodically. I then modified my SW so that it does not record the values and instead reads again immediately, over and over. What I now observe is that the LED (FIFO full) never turns on.

If I understand this correctly, once I commit the packet using PKTEND, the data goes to the driver buffer that has the size XferSize. Since my SW application is not reading that buffer fast enough, it gets full and then stops any other packets from coming in, therefore causing the FIFO full issue. Would I be correct in saying this?

Horsa

#4
1)
Have you tried using one thread (with normal to high priority) to read the buffer and another thread (with low to normal priority) to write the txt file?
From your description, it seems likely that you are using single-threaded SW that cannot keep up with the throughput, because the function that reads the TE USB FX2 driver buffer and the Open/Write/Close file functions run in the same thread.

If you separate these two actions into two different threads, it is likely that your multithreaded SW will be able to keep up.
If your host computer is not responsive, you may need to use a high-priority thread to read the TE USB FX2 driver buffer.
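
A minimal sketch of this two-thread split in standard C++ follows. Note the assumptions: readChunk and writeChunk are hypothetical callbacks standing in for the blocking driver read and the file write (they are not functions of the TE, Cypress or libusb APIs), and thread priorities are left out because setting them is platform-specific.

```cpp
#include <condition_variable>
#include <cstdint>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

using Chunk = std::vector<std::uint8_t>;

// Thread-safe FIFO that hands chunks from the reader to the writer thread.
class ChunkQueue {
public:
    void push(Chunk c) {
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push_back(std::move(c));
        }
        cv_.notify_one();
    }
    // Signals that no more chunks will arrive.
    void close() {
        {
            std::lock_guard<std::mutex> lock(m_);
            closed_ = true;
        }
        cv_.notify_all();
    }
    // Blocks until a chunk is available; returns false once closed and drained.
    bool pop(Chunk& out) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty() || closed_; });
        if (q_.empty()) return false;
        out = std::move(q_.front());
        q_.pop_front();
        return true;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::deque<Chunk> q_;
    bool closed_ = false;
};

// readChunk (hypothetical): fills the chunk from the driver, returns false to stop.
// writeChunk (hypothetical): slow sink, e.g. formatting values into a text file.
void runPipeline(const std::function<bool(Chunk&)>& readChunk,
                 const std::function<void(const Chunk&)>& writeChunk) {
    ChunkQueue queue;
    // The writer runs in its own thread, so slow file I/O never blocks the reader.
    std::thread writer([&] {
        Chunk c;
        while (queue.pop(c)) writeChunk(c);
    });
    // The reader loop stays in the calling thread and only reads and enqueues.
    Chunk c;
    while (readChunk(c)) queue.push(std::move(c));
    queue.close();
    writer.join();
}
```

The queue here is unbounded, so a writer that is slower than the incoming data on average will make memory grow; a real application would probably cap the queue length or write the data in binary form.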


2)
Your buffer at the host side shall have the correct size.
See https://wiki.trenz-electronic.de/pages/viewpage.action?pageId=10620656

Even if you decide to move to libusb(x) (see for instance Linux_FUT https://wiki.trenz-electronic.de/display/TEUSB/Linux_FUT) but continue to use a synchronous READ function and a single thread, you should not expect a much higher throughput.
To obtain a (substantially) higher throughput (either with libusb(x) or CyAPI), you should use a packet size of at least 400-500 bytes (the higher, the better), asynchronous functions and multiple threads.


3)
Have you already considered using our VCOM reference design?
https://wiki.trenz-electronic.de/display/TEUSB/Virtual+COM+Port+Interface
Could this reference design better suit your project's needs?