#milkymist IRC log for Saturday, 2011-11-19

wpwraklet's see if git-send-email works ...04:52
kristianpaulseems it does05:00
wolfspraulwpwrak: oh wow, lots of patches05:00
wolfspraul'full-speed almost works now', that sounds like the rest can also be fixed in software?05:01
wolfspraulI am referring to my question from a few days ago whether we can rule out hardware bugs or needed improvements to support full-speed.05:01
wolfspraulI guess now we can?05:01
wpwraklet's see. it could still be that we're weak in the analog domain05:25
wpwrakbut at least we're not hopeless :)05:25
wpwraknow i'm unrolling the loop of usb_rx ... that should improve timing quite a lot ... and make the code really cryptic ;-)05:26
wolfspraul[analog domain] you mean the signals going out on the wire are not clean, wrong timing?05:39
wolfspraulcan't we just look at that on a scope and tell whether it's good or not?05:39
wpwrakthere could be signal distortion, yes05:39
wpwraktricky. a 100 MHz / 400 MSa/s scope should be sufficient for this kind of task, but i get a lot of noise, so it's hard to see the exact shape05:40
wolfsprauland the noise is coming from the m1 board?05:43
wpwrakanother way to find out is to check the CRC, and count errors05:43
wpwrakno, from my environment, the measurement setup05:43
wolfspraulshould adam take a quick measurement? when he goes to minbo or xray shop or other places, maybe they have a good scope standing around - if measurement is easy...05:44
wolfspraulsounds like just power m1, plug in usb keyboard, take measurement05:45
wpwraksomething like that, yes :)05:46
wpwrakyou have to be careful not to distort the signal with the probe05:46
wpwrakusb doesn't like capacitive loads on the signals05:47
wpwrakalready full-speed is a bit on the finicky side05:47
wolfspraulhmm, well. should we do it or not?05:48
wolfspraulin case we do, can you describe in a line or two how you would do the measurement setup?05:49
wpwrakcan't hurt if he has a look, if he feels it's useful. there's always the chance of someone noticing something. and maybe he can find a fast scope with a FET probe. like that GHz monster we had at openmoko. with such a device, you rule :)05:49
wpwrak(okay, that one ran windows, so in that particular case, you still lose)05:50
wpwrakhmm. 12 bit times, down from 16. still darn tight.05:52
wpwrak(sample size 1, as any good statistician would)05:53
mumptaifull-speed usb should be below 100MHz bandwidth05:59
wpwraklet's see ... full-speed rise/fall time is 4-20 ns. let's say 10 ns. for an accuracy of 5% or better, according to lecroy, the scope's rise/fall time should be 1/3 of that.06:13
wpwrakthe 3 dB rise/fall time would be 1/3/analog_bw. so yes, a 100 MHz scope would just do. the next problem is the sample rate. some low-cost brands have really low sample rates. e.g., mine does typically only up to 200 MSa/s. 400 MSa/s if i use only one channel.06:15
wpwrak(rigol improved that in the successor)06:16
wpwrakthe rule of thumb is somewhere around sample rate = 5 x signal bandwidth06:16
wpwrakso my scope could barely make sense of such a signal06:17
wpwrakof course, i could spot any truly massive distortion, but i couldn't tell with certainty whether the signal meets specs06:17
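The scope estimate above can be written out as a short calculation. This is only a sketch of the rules of thumb quoted in the chat (scope rise time = 1/3 of the signal rise time, rise time ≈ 1/(3 × analog bandwidth), sample rate ≈ 5 × bandwidth). The function names are made up for illustration, and using the required scope bandwidth as the "signal bandwidth" in the last rule is an assumption.

```python
# Rule-of-thumb scope requirements for looking at full-speed USB edges.
# All figures are taken from the discussion above; names are illustrative.

def required_scope_rise_time(signal_rise_s, ratio=3.0):
    """Scope rise time should be about 1/3 of the signal's rise time."""
    return signal_rise_s / ratio

def required_bandwidth(scope_rise_s):
    """Approximation quoted in the chat: t_rise ~= 1 / (3 * BW)."""
    return 1.0 / (3.0 * scope_rise_s)

def required_sample_rate(bandwidth_hz, factor=5.0):
    """Rule of thumb: sample rate of roughly 5 x bandwidth."""
    return factor * bandwidth_hz

signal_rise = 10e-9                       # "let's say 10 ns"
scope_rise = required_scope_rise_time(signal_rise)
bandwidth = required_bandwidth(scope_rise)
sample_rate = required_sample_rate(bandwidth)

print(f"scope rise time: {scope_rise * 1e9:.2f} ns")
print(f"analog bandwidth: {bandwidth / 1e6:.0f} MHz")
print(f"sample rate: {sample_rate / 1e6:.0f} MSa/s")
```

Under these assumptions a 100 MHz scope is just enough on bandwidth, while 400 MSa/s falls a little short of the 500 MSa/s the rule of thumb asks for, which matches the "barely" above.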
wpwrakwolfspraul: one of these cases where expensive tools do improve the results. with a slow scope, there's just a lot you can't see06:23
wpwrakgrr. 15 cycles. despite adding extra exit points to the poll loop. this is messy.06:24
wpwrakhehe, 6 cycles. that's more like it ;)06:33
wolfspraulwpwrak: oh, you totally misquote me and out of context too. I have not said anything against expensive tools :-)06:53
wolfspraulit's safe to assume that in most cases there is a reason they are expensive, i.e. someone buys them and they create that value.06:53
wolfspraulthat's why I'm wondering now how we can effectively and quickly rule out analog usb issues... very aware that bad tools may complicate our effort and even lead us in the wrong direction.06:54
wolfspraulone time we had this nasty potential sdram problem, and Sebastien got access to a high-end scope at Xilinx to track down the issue... which was great!06:54
wolfspraulotherwise I take the risk on manufacturing to produce something I need to repair or even discard later06:54
wolfspraulhere http://www.milkymist.org/wiki/index.php?title=RC1_signal_integrity_measurements06:57
wpwrakyeah. it's nice to have definite answers on such issues. not things that are guesstimates based on indirect evidence, etc. a bit like they do astronomy of distant planets :)06:57
wolfspraulman, I cannot believe I still haven't done another news release06:57
wolfspraulurgh06:57
wolfspraulMUST get it done before Monday!06:57
wolfspraul(note to self)06:57
wpwrakeleven days left until the date for the quarterly news :)06:57
wpwrakor sooner. always better :)06:57
sb0http://www.xilinx.com/txpatches/pub/documentation/misc/improving%20ddr%20sdram%20efficiency.pdf16:35
Action: kristianpaul click16:36
kristianpaulsb0: isn't that what you do with hpdmc ?16:43
sb0no, hpdmc is in-order16:44
sb0with fast page mode and pipelining16:44
sb0I don't know if the article says it (haven't finished reading yet) but OOO controllers have a latency penalty16:44
sb0if the cores using the DRAM can't maintain performance in the presence of the extra latency, the benefit of OOO diminishes16:46
sb0but I think the future is OOO and prefetching. especially if we use DDR3 (or more) someday.16:48
sb0well it just says: "However, since reordering improves efficiency and therefore reduces memory controller occupancy, the result is to improve average read latency.". I'm not sure the case is so clear-cut, especially with DDR116:51
wpwrakah, what would be needed to use DDR2/3 in M1 ? could pin-compatible chips plus a SoC change be enough ? or would it be a more complex redesign ?16:58
kristianpauloh, FEL was removed from F16, sb0 ?16:59
sb0way more complex redesign16:59
sb0different voltage, different package, different pinout, different timings16:59
sb0DDR 2/3 is BGA16:59
kristianpaulbut is DDR3 worth it at 80MHz ?16:59
wpwrakah, pity. it's never easy, isn't it ? :(17:00
kristianpaul80mhz soc*17:00
sb0we'd use a clock multiplier17:00
sb0and probably the memory controller would generate several DRAM commands in one cycle at 80MHz that would then be serialized and sent into the DRAM in 10 cycles at 800MHz17:01
kristianpaul:-)17:01
sb0wpwrak, I think there is already a 4x increase in memory bandwidth possible with the current system17:02
sb0we can use a 2x clock multiplier and serdes17:02
sb0and get another 2x increase with OOO and prefetching17:02
wpwrakoh, wow17:02
wpwrak1080p and deep color, here we come ! ;-)17:03
sb0btw, the TMU already has an experimental prefetch system in SoC head17:03
sb0'experimental' = it works, but the performance boost isn't so high17:04
sb0some 30% with the current memory system17:04
sb0if you want to know how it's done, it's explained here: http://www.graphics.stanford.edu/papers/texture_prefetch/texture_prefetch_down.pdf17:06
sb0without the reorder buffer of course, there's instead a simple register that holds one transaction17:06
sb0that's actually one big limiting factor17:07
sb0s/register/FIFO17:09
wpwrakhmm. something is still screwy with USB timing. that MS mouse comes and goes. make a small tweak and it dies. make another tweak and it works again.17:12
sb0full HD would need more unfortunately, especially if we use 10:10:10 colors (which I think is the right thing to do, with more potential than resolution)17:12
sb0but we can still scale on the fly (-:C17:13
sb0(for the GUI it should be OK though)17:13
wpwrakhm yes, 4x even isn't quite enough for 1024x768 (2.56x) if the pixel size doubles too17:14
wpwrakwell, you mentioned once that 800x600 would work. so maybe there is enough room :)17:15
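The bandwidth arithmetic in this exchange can be checked with a few lines. This is a rough sketch only: it assumes the baseline is 640x480 (which is what the 2.56x figure implies), that "pixel size doubles" means a factor of 2 in bytes per pixel, and that the two proposed 2x gains simply multiply. The helper name is hypothetical.

```python
# Back-of-the-envelope check of the memory bandwidth figures discussed above.

def pixel_ratio(new, old):
    """Ratio of pixel counts between two (width, height) resolutions."""
    return (new[0] * new[1]) / (old[0] * old[1])

BASELINE = (640, 480)        # assumed baseline resolution
PIXEL_SIZE_FACTOR = 2        # "if the pixel size doubles too"

# 2x from clock multiplier + serdes, another 2x from OOO and prefetching
available_gain = 2 * 2

for res in [(800, 600), (1024, 768)]:
    needed = pixel_ratio(res, BASELINE) * PIXEL_SIZE_FACTOR
    verdict = "fits" if needed <= available_gain else "does not fit"
    print(f"{res[0]}x{res[1]}: needs {needed:.2f}x, {verdict} in {available_gain}x")
```

With these assumptions 1024x768 needs about 5.12x and exceeds the 4x gain, while 800x600 needs about 3.1x and stays within it.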
sb0the internal resolution is 512x51218:45
sb0that's the size of the renderer's internal texture, which is then scaled to 640x480 (or 800x600)18:46
sb0this works18:46
sb0switching to 1024x768, I also increased the internal buffer to 1024x1024. this creates a lot of slowdown.18:46
wpwrakah, i see. 1024x512 ?18:50
wpwrak4:3 is dying anyway :)18:51
sb0wpwrak, what clock recovery algorithm would you recommend?18:53
sb0right now it's resyncing at every transition18:53
sb0there's a counter running at 48MHz, and a transition resets it18:54
sb0note that the 48MHz clock is not synchronized to the 12MHz transmission clock from the device18:56
wpwrakhmm, maybe some deglitching could help ? (if that's the issue)18:56
wpwrak1:4 may also be a bit low. 1:8 may give more margin of error.18:58
wpwraks/of/for/18:58
wpwrakat which phase offset to you sample ? +1 cycle ?18:59
wpwraks/to/do/18:59
wpwrakgrmbl. can't type.18:59
sb096MHz is more difficult on this slow FPGA19:00
sb0would be easier if we had a virtex or such19:00
sb0:P19:00
wpwrak;-)19:01
sb02nd cycle, just in the middle19:01
sb0yes, we can try deglitching ...19:03
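The clock recovery scheme described here (a counter at 4x the bit rate that is reset on every transition, with the line sampled mid-bit) can be modelled in software. This is a simplified sketch, not the actual softusb HDL: the function names are invented, and reading "2nd cycle" as counter value 2 (counting from zero) is an assumption.

```python
# Toy model of transition-resync clock recovery at 4x oversampling.

OVERSAMPLE = 4      # 48 MHz sampling clock / 12 MHz full-speed bit rate
SAMPLE_PHASE = 2    # assumed meaning of "2nd cycle, just in the middle"

def recover_bits(samples, oversample=OVERSAMPLE, sample_phase=SAMPLE_PHASE):
    """Recover line levels from an oversampled stream of 0/1 samples."""
    bits = []
    counter = 0
    prev = samples[0]
    for level in samples:
        if level != prev:
            counter = 0                  # resync on every transition
        if counter == sample_phase:
            bits.append(level)           # take the sample mid-bit
        counter = (counter + 1) % oversample
        prev = level
    return bits

def oversample_bits(bits, oversample=OVERSAMPLE):
    """Ideal line: each bit held for exactly `oversample` samples."""
    return [b for b in bits for _ in range(oversample)]

ideal = [1, 0, 0, 1, 1, 1, 0]
print(recover_bits(oversample_bits(ideal)))

# A bit stretched to 5 samples (unsynchronized clocks) is absorbed
# by the resync on the following transition.
stretched = [1] * 4 + [0] * 5 + [1] * 4
print(recover_bits(stretched))
```

In this model a single-sample glitch also counts as a transition and resets the counter, which is one way to see why deglitching was suggested.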
wpwraki wonder if the problem could be with SYNC. it seems that the first edge of SYNC can be a bit slower (?) than the rest19:04
wpwrakat least i've seen something like this mentioned19:04
sb0meh, debian is no longer shipping gcc-avr?19:12
sb0http://packages.debian.org/search?keywords=gcc-avr&searchon=names&suite=testing&section=all19:14
sb0http://packages.debian.org/search?keywords=gcc-avr&searchon=names&suite=unstable&section=all19:15
sb0wtf19:15
kristianpaulswitch to sid :-)19:15
kristianpaulor stay in stable :-)19:16
kristianpaulhttp://packages.qa.debian.org/g/gcc-avr/news/20110708T163910Z.html19:16
sb0debian stable is for meter-long bearded sysadmins19:17
kristianpaulor people with low bandwidth ;-) (me)19:18
kristianpaulah, http://release.debian.org/migration/testing.pl?package=gcc-avr19:19
wpwraksb0: wrong. it's for their grandparents ;-)19:23
wpwraksb0: do you have full-speed HID devices at home to test things with ?19:24
lars_i use debian stable19:24
lars_at least on the hardware where it still works19:24
sb0no, but I have that LV319:25
sb0I didn't know full speed HID devices existed. but why make it simple ...19:26
sb0anyway, I'm still in Norway, back tomorrow19:26
sb0wpwrak, btw what is the use case for the software controlled USB power switch? heavy-handed debugging a la norruption?19:27
wpwrakthat, power-cycling devices that don't respond to any nicer form of reset, turn off things when not needed or when causing trouble, etc.19:29
wpwrakbeing able to power-cycle is also nice for remote debugging19:29
wpwrakbeats shipping USB devices around the globe ;-)19:30
sb0i see...19:30
sb0let's add it then19:30
sb0should the switch be controlled from navre or lm32?19:31
sb0I can easily add two outputs to the existing GPIO controller in charge of the LEDs19:31
wpwrakcan control pass from one to the other with a soc update ? or are there routing restrictions that get in the way ?19:32
sb0in a first version, the BIOS would turn them on and we can forget about them for a while19:32
wpwraksure. easy does it :)19:33
sb0there are no routing restrictions, especially for such low speed signals19:33
wpwrakyou were concerned about glitches. at what time scales would they be ? < 1 us ? or longer ?19:33
wpwrakperfect. that's what i thought19:33
sb0I don't really mean a 'glitch' in the strict sense of the term - in fact, there shouldn't be any19:34
sb0what happens though is that there are weak pull ups when the FPGA is unconfigured19:34
sb0which happens at power up and when switching between SoC and standby bitstreams19:34
sb0those have to be taken into account19:35
sb0I don't like the idea of switching power to the USB device for < 1s until the FPGA is configured and then switch it off19:35
wpwrakokay, that shouldn't be too critical. we can just default to vusb off.19:35
sb0which would happen at power up if the pull up is not neutralized (assuming the switch control is active high)19:36
wpwrakwe can choose between active low or high19:36
wpwrakwe just have to tell adam that it's no longer up to his coin toss ;-)19:36
sb0ok, if we choose active high, then we don't need to neutralize the FPGA pull ups19:37
sb0and the USB devices will be off while the FPGA is reconfigured19:37
wpwrakyup. and if the pull-ups are too weak, we can easily help them.19:38
wpwrakor if they're likely to transition into Z19:38
sb0we can then drive those pins high in the standby bitstream to keep USB off19:38
sb0and when the SoC is loaded, the GPIO defaults to 0 and turns USB on, so we have nothing to do19:38
wpwraksounds good to me19:39
sb0s/choose active high/choose active low19:40
sb0of course19:40
wpwrakerr, yes :)19:40
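The outcome of this exchange (an active-low switch control, after the s/active high/active low/ correction) can be summarised as a small truth table. The sketch below only models the behaviour described in the chat; the state names and functions are hypothetical and do not correspond to any register or signal name in the SoC.

```python
# Toy model of the USB power switch control pin across FPGA states,
# assuming an active-low enable as settled on above.
from enum import Enum

class FpgaState(Enum):
    UNCONFIGURED = "unconfigured"
    STANDBY = "standby bitstream"
    SOC = "SoC bitstream"

def control_pin_level(state, gpio_value=0):
    """Logic level seen by the switch control input."""
    if state is FpgaState.UNCONFIGURED:
        return 1            # weak FPGA pull-up holds the pin high
    if state is FpgaState.STANDBY:
        return 1            # standby bitstream drives the pin high
    return gpio_value       # SoC: GPIO output, defaults to 0

def usb_power_on(pin_level):
    """Active-low control: a low level turns USB power on."""
    return pin_level == 0

for state in FpgaState:
    level = control_pin_level(state)
    power = "on" if usb_power_on(level) else "off"
    print(f"{state.value}: pin={level}, USB power {power}")
```

USB power stays off while the FPGA is unconfigured or in standby, and comes on with the SoC's default GPIO value without any software action.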
sb0I think the FPGA pull ups are strong enough ... is that a CMOS input on the switch?19:41
sb0I don't remember the value from the datasheet, but as you can see they let pass a current sufficient to light the LEDs noticeably19:41
wpwrak1 uA leakage (max)19:42
sb0yes, no problem then19:42
wpwrakyeah. that'll be plenty :)19:42
wpwrakand now, lunch ...19:43
sb0bon appetit19:44
wpwrakmerci !19:44
sb0ah, seems you fixed the DATAx mismatch bug I sometimes experienced... thanks :)20:15
wpwrakheh, it was fun to see with just how many protocol violations one can almost get away :) well, i probably added a few of my own, so the fun probably isn't quite over yet ...20:30
sb0hmm... your first series of patches doesn't apply cleanly20:41
sb0do you still use the trigger code?20:42
sb0ah, seems you didn't merge my new SOF/keepalive generation code20:44
wpwrakno, i have that20:45
wpwrakmaybe you tripped over 4k vs. 8k ?20:45
wpwraki branched off "USB: send SOFs and keepalives on both ports and immediately after reset"20:46
wpwrakcommit f6c7474ae3b181157d8950e25c4705d53d9ae9c120:46
sb0no20:48
sb0is there a debug mode for patch? it rejects a hunk that looks totally OK for me20:49
wpwrakheh ;-)20:49
sb0it just says '1 out of 1 hunk FAILED'... no way to know more?20:50
wpwrakmaybe -l helps ? (ignore whitespace) but i usually just apply it manually if patch starts hallucinating20:51
sb0http://pastebin.com/LisFw0AK20:51
sb0ah, yes, -l helps20:51
wpwrakmystery difference :)20:53
wpwrakmaybe we'll have a trailing blank more or less now :)20:53
wpwrakah, there are some tailing tabs. maybe that's it20:54
wpwraktRailing20:54
GitHub15[milkymist] sbourdeauducq pushed 11 new commits to master: http://git.io/_DfZ1g21:00
GitHub15[milkymist/master] softusb: 4 kB hack - Werner Almesberger21:00
GitHub15[milkymist/master] softusb: use OE# of port A for trigger - Werner Almesberger21:00
GitHub15[milkymist/master] softusb: send SETUP and DATA0 back-to-back - Werner Almesberger21:00
GitHub5[milkymist] sbourdeauducq pushed 6 new commits to master: http://git.io/vOd9Rg21:17
GitHub5[milkymist/master] softusb: partially unroll usb_in - Werner Almesberger21:17
GitHub5[milkymist/master] softusb: send ACKs from dedicated inline function - Werner Almesberger21:17
GitHub5[milkymist/master] softusb: fail garbled packets fatally again - Werner Almesberger21:17
sb0wpwrak, all merged, thanks a lot!21:18
sb0http://www.smsc.com/index.php?pid=28&tid=14321:28
sb0if someone wants high speed, that could be useful... I'm not sure if and how the fpga could do clock recovery at 480 Mbps21:28
lars_mwalle: there seems to be a problem with the new uart linux code. milkymist_uart_tx_char is sometimes called although the uart tx path is still busy21:43
lars_my fix is to check whether THRE is set, and if not leave the routine right away and wait for the next interrupt21:46
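The guard lars_ describes can be illustrated with a toy model. This is not the Linux driver code: the class and method names are invented, and it only assumes that THRE means the transmit holding register is empty and that an interrupt fires when it becomes empty again.

```python
# Toy model of a "return early unless THRE is set" transmit guard.
from collections import deque

class ToyUart:
    def __init__(self):
        self.thre = True          # transmit holding register empty
        self.sent = []

    def tx_char(self, pending):
        """Send one character, but only if the TX path is idle."""
        if not self.thre:
            return False          # busy: leave and wait for the next interrupt
        if not pending:
            return False          # nothing to send
        self.sent.append(pending.popleft())
        self.thre = False         # holding register is now occupied
        return True

    def tx_done_interrupt(self, pending):
        """Hardware finished a character: mark idle and send the next."""
        self.thre = True
        self.tx_char(pending)

uart = ToyUart()
queue = deque("abc")
uart.tx_char(queue)               # sends 'a'
uart.tx_char(queue)               # spurious call while busy: ignored
uart.tx_done_interrupt(queue)     # sends 'b'
print(uart.sent)
```

The spurious second call leaves the queue untouched, so no character is lost or overwritten while the transmitter is busy.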
wpwraksb0: (merged) thanks !22:36
--- Sun Nov 20 2011 00:00
