Hoping to get some expert advice here.
Have an embedded USB application running on a Cortex M4. The application uses full-speed USB and is based on a Freescale-provided CDC USB sample project. My bulk "IN" endpoint is defined for 64 bytes. The embedded application communicates with an XP SP3, 2.5 GHz Core 2 Duo laptop through the usbser.sys device driver and a COM terminal program. I'm also running a software USB analyzer (USBlyzer). I have come up to speed on USB about as much as I humanly can in a few weeks' time, including reading Jan Axelson's "USB Complete", so I'd like to think I mostly know what is going on.
Every millisecond, my embedded application needs to send roughly 500 bytes to the host PC in a single "IN" transfer. According to everything I've read, I should be able to do this by sending a burst of eight 64-byte packets followed by a ZLP to terminate the transfer. So here's what's happening: the terminal application polls my device with an IN request using a 4096-byte buffer. This generates a "token received" interrupt on my embedded device, which I _immediately_ service by sending a burst of 8 consecutive packets followed by a ZLP. (I also made sure to wait for the "OWN" bit to clear between packets during this burst.) Half these packets appear to be lost on the other end. The only way I can seem to get a reliable transfer is if I send a single packet per transaction and wait for the host PC to send the next "IN" request before starting a new transaction. This winds up killing my performance.
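For reference, this is roughly what my burst looks like, stripped down to a minimal sketch. The helper names (usb_in_bd_owned_by_sie(), usb_in_submit_packet()) are placeholders for the real Freescale stack calls, not the actual API:

```c
#include <stdint.h>

#define EP_IN_SIZE    64u
#define BURST_PACKETS 8u

/* Hypothetical low-level hooks into the USB-FS buffer descriptor table. */
extern int  usb_in_bd_owned_by_sie(void);                           /* OWN bit still set?      */
extern void usb_in_submit_packet(const uint8_t *buf, uint16_t len); /* fills the BD, sets OWN  */

/* Send one block (~500 bytes here) as a burst of 64-byte packets. */
void cdc_send_block(const uint8_t *data, uint16_t len)
{
    uint16_t offset = 0;

    for (uint32_t pkt = 0; pkt < BURST_PACKETS && offset < len; pkt++) {
        uint16_t chunk = (uint16_t)((len - offset) > EP_IN_SIZE ? EP_IN_SIZE
                                                                : (len - offset));

        /* Wait for the SIE to hand the buffer descriptor back (OWN clear). */
        while (usb_in_bd_owned_by_sie()) {
            /* spin - real code should have a timeout */
        }
        usb_in_submit_packet(&data[offset], chunk);
        offset += chunk;
    }

    /* A short final packet already terminates the transfer; the explicit
     * ZLP is only needed when the last packet was a full 64 bytes. */
    while (usb_in_bd_owned_by_sie()) { }
    if ((len % EP_IN_SIZE) == 0u) {
        usb_in_submit_packet(0, 0u);    /* ZLP */
    }
}
```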
Am I simply asking too much of the usbser.sys driver or am I missing something simple here?
First of all, thank you for sharing your time and expertise. It is incredibly helpful to have someone who understands the gory details of USB to bounce this stuff off of.
> If you see 192 bytes on USBlyzer, the device sends just 192 bytes (three 64-byte transactions and a ZLP). If this is the result of a 512-byte transfer, your firmware still has trouble with the 512-byte transfer.
I do indeed see only 192 bytes on USBlyzer. Digging a little deeper, the issue is that the USB module's SIE on my Cortex M4 does not release the "OWN" bit after 3 consecutive 64-byte packets. Is it possibly waiting for an ACK that never comes? So the higher-level routine is packetizing correctly, but at the SIE level the transfer stops after 3 packets. The OWN bit does eventually clear, after a millisecond or so. I've also looked at the DATA0/1 toggle bit, and that seems to be handled correctly.
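To catch this, I've been timing how long the SIE holds the OWN bit instead of spinning blindly. A rough sketch of that diagnostic, again with placeholder names (usb_in_bd_owned_by_sie(), micros()) rather than the real stack calls:

```c
#include <stdint.h>
#include <stdbool.h>

extern bool     usb_in_bd_owned_by_sie(void);   /* OWN bit in the IN BD control word */
extern uint32_t micros(void);                   /* free-running microsecond counter  */

/* Returns true if the SIE released the buffer descriptor within timeout_us;
 * either way, *held_us reports how long the OWN bit stayed set. */
bool usb_in_wait_own_clear(uint32_t timeout_us, uint32_t *held_us)
{
    uint32_t start = micros();

    while (usb_in_bd_owned_by_sie()) {
        if ((micros() - start) > timeout_us) {
            *held_us = micros() - start;
            return false;                       /* SIE never handed the BD back */
        }
    }
    *held_us = micros() - start;
    return true;
}
```

With this in place, the stall shows up clearly: packets 1-3 come back in tens of microseconds, then the fourth wait runs for roughly a millisecond.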
> OR
> - Your PC application reads just 192 bytes. Read more.
I'm thinking this may be the case. I've run this test with both Tera Term and HyperTerminal, and per USBlyzer they both seem to be making 4096-byte IN requests. I may try writing a quick VB.net application and setting "SerialPort.WriteBufferSize" as you suggest.
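Before going the VB.net route, I may first try the same idea from plain Win32 C: open the virtual COM port and keep reading until the full 512 bytes arrive. A quick sketch of what I have in mind (the "COM3" name and the timeout values are just placeholders for my setup):

```c
#include <windows.h>
#include <stdio.h>

int main(void)
{
    HANDLE h = CreateFileA("\\\\.\\COM3", GENERIC_READ | GENERIC_WRITE,
                           0, NULL, OPEN_EXISTING, 0, NULL);
    if (h == INVALID_HANDLE_VALUE) {
        printf("open failed: %lu\n", GetLastError());
        return 1;
    }

    /* Generous timeouts so one read can span several USB packets. */
    COMMTIMEOUTS to = {0};
    to.ReadIntervalTimeout      = 10;    /* ms allowed between bytes  */
    to.ReadTotalTimeoutConstant = 100;   /* ms allowed per read call  */
    SetCommTimeouts(h, &to);

    unsigned char buf[512];
    DWORD total = 0, got = 0;

    /* Keep reading until the whole 512-byte block is in, or we time out. */
    while (total < sizeof(buf)) {
        if (!ReadFile(h, buf + total, sizeof(buf) - total, &got, NULL) || got == 0)
            break;
        total += got;
    }
    printf("read %lu bytes\n", total);

    CloseHandle(h);
    return 0;
}
```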
However, stepping back and looking at the overhead required on the embedded side, I think a better solution for me is to use the high-speed USB port on the Kinetis K70 processor instead. Even if I do clear this last hurdle and get 512-byte transfers going, there is still quite a bit of overhead required to babysit the SIE when you packetize bulk transfers at full speed (USB 1.1), not too unlike a UART. For example, on a Cortex M4 with a 100 MHz core clock, a single 64-byte packet write, including waiting for the OWN bit to clear, takes roughly 70 µs. Eight of those per millisecond is about 560 µs, more than half my frame budget, so for my application this is simply too long.
High-speed USB should resolve this because I can fire off a single 512-byte packet every millisecond. Now I simply have to port Freescale's CDC device example from full speed to high speed. Fun. :)