#milkymist IRC log for Tuesday, 2013-05-28

GitHub137[migen] sbourdeauducq pushed 1 new commit to master: http://git.io/TdhRug 14:16
GitHub137migen/master bac62a3 Sebastien Bourdeauducq: Make memory ports part of specials...14:16
GitHub156[milkymist-ng] sbourdeauducq pushed 4 new commits to master: http://git.io/dcROOg 14:17
GitHub156milkymist-ng/master fb3e612 Sebastien Bourdeauducq: Use new memory port API14:17
GitHub156milkymist-ng/master fdb021c Sebastien Bourdeauducq: dvisampler: increase frequency of reports to avoid missing WER values14:17
GitHub156milkymist-ng/master 701aac2 Sebastien Bourdeauducq: bios/linker.ld: flash -> rom14:17
_florent_Hi15:26
_florent_the Kintex7 DDR Phy + ASMICON is working ;) (at least Memtest is Ok)15:26
_florent_I'm using quarter rate commands : 200MHz mem clk and 50MHz command clk15:27
_florent_Read & Write leveling are not implemented but I will probably try to add it to run at higher frequencies.15:27
_florent_I'm now trying to run the ASMICON's ports at a multiple of the command frequency (to be able to run the CPU at more than 50 MHz...)15:27
_florent_I've added some clock enable logic in the ports, but I have a question about clock domains definition with migen:15:28
_florent_- Each Asmicon port will have its own clock domain (a multiple or not of the command frequency), let's say it uses "sys_clk" for now15:28
_florent_- Asmicon is using "asmicon_clk"15:28
_florent_ My first attempt was to pass the clock_domain in the port parameter and use self.add_submodule(new_port, {"sys": clock_domain}) in the get_port function15:29
_florent_and use self.add_submodule(self.asmicon, {"sys": "asmicon"}) in the top15:29
_florent_But it seems the clock_domain renaming in the top is also renaming the port (since I'm using sys_clk for the port)15:29
_florent_lekernel: do you have an idea on how to do it?15:29
lekernelgreat!15:34
lekernelso you have 1:4 serialization for the commands, right?15:35
lekerneland 1:8 for data15:35
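The ratios mentioned above follow from the two clock frequencies. A minimal sketch of the arithmetic, assuming the command clock divides the memory clock evenly and that data moves on both edges of the memory clock (the function name is illustrative, not from the project):

```python
def serialization_ratios(mem_clk_hz, cmd_clk_hz):
    """Return (command ratio, data ratio) for a DDR interface.

    Assumes the memory clock is an integer multiple of the command clock.
    """
    if mem_clk_hz % cmd_clk_hz != 0:
        raise ValueError("memory clock must be a multiple of the command clock")
    cmd_ratio = mem_clk_hz // cmd_clk_hz   # memory clock cycles per command clock cycle
    data_ratio = 2 * cmd_ratio             # DDR: two data transfers per memory clock cycle
    return cmd_ratio, data_ratio


# 200 MHz memory clock, 50 MHz command clock, as quoted in the log
print(serialization_ratios(200_000_000, 50_000_000))
```

With the frequencies quoted in the log this gives 4 for the commands and 8 for the data.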
_florent_yes, like you were saying the other day15:36
_florent_it was easier in fact15:36
lekernelexcellent15:36
_florent_and I'm using 4 phases15:36
lekernelwith 8 bank machines?15:37
_florent_hmm, I have to check15:37
_florent_I just made a simple change in the multiplexer to support 4 phases15:38
lekernelDDR3 has 8 banks (as opposed to 4 in DDR), so I would assume you ran into that15:38
lekernelare all 8 banks working, or are you just using 4? ;)15:38
_florent_that's what I have to check :)15:39
_florent_at least I've changed ba width from 2 to 315:40
lekernelclock domain remapping in add_submodule does the remapping for the module and all its submodules, yes15:41
_florent_ok thanks15:42
lekerneland btw you can use the shorter form: {"sys": foobar} => foobar15:42
lekernelsys is implied by default15:42
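The renaming behaviour described above can be pictured with a toy model. This is plain Python, not migen: the `Module` class, its attributes and the way renaming is applied are all assumptions made for illustration. It only shows why a rename applied at the top also reaches a port that still uses "sys", and how the short string form stands for {"sys": name}.

```python
class Module:
    """Toy stand-in for a hardware module with a single clock domain name."""

    def __init__(self, name, clock_domain="sys"):
        self.name = name
        self.clock_domain = clock_domain
        self.submodules = []

    def add_submodule(self, module, rename=None):
        # A plain string is treated as shorthand for {"sys": string}.
        if isinstance(rename, str):
            rename = {"sys": rename}
        if rename is not None:
            module.rename_clock_domains(rename)
        self.submodules.append(module)

    def rename_clock_domains(self, mapping):
        # The rename applies to this module and, recursively, to all its submodules.
        self.clock_domain = mapping.get(self.clock_domain, self.clock_domain)
        for sub in self.submodules:
            sub.rename_clock_domains(mapping)


asmicon = Module("asmicon")
port = Module("port")                                  # port clocked by "sys"
other_port = Module("other_port", clock_domain="port0")  # hypothetical distinct domain name
asmicon.add_submodule(port, {"sys": "sys"})
asmicon.add_submodule(other_port)

top = Module("top")
top.add_submodule(asmicon, "asmicon")                  # shorthand for {"sys": "asmicon"}

print(asmicon.clock_domain, port.clock_domain, other_port.clock_domain)
```

In this toy model the port that kept the name "sys" ends up in "asmicon" along with the controller, while the port whose domain has a different name is left alone.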
lekernelI wonder if ASMI can really meet timing when you have a lot of ports... I'm bumping into issues on the slowtan6 video mixer atm15:43
_florent_for now my port won't run asmi at more than 50MHz, so it should be ok15:44
lekernelmaybe I need to simplify the architecture a bit, eg just have a crossbar switch to the parallel bank machines15:44
lekernelthis can also make it easier to have multiple memory controllers15:45
_florent_what is the critical path on the asmi?15:45
lekernelthe hub management15:46
lekernelso I want to replace that hub with a crossbar switch, and not have split transactions anymore15:46
_florent_it can also be interesting to have ports with integrated async fifo and / or bus width adaptation15:48
lekernelthere can't be page hit optimization reordering anymore, too - only read/write turnaround minimization reordering, and parallel bank commands15:48
lekernelhmm, the problem is - how do you make that async fifo generic enough?15:49
lekernelhow would you apply that to e.g. a framebuffer?15:49
lekernelrun all the logic on the pixel clock?15:50
lekernelwith just two async fifos into the system clock domain to send memory read commands and get the results?15:50
_florent_the idea was just to be able to have ports with different frequency than the asmicon15:51
_florent_instead of having the fifo in the framebuffer as it is now, having it running @ pixel_clk15:52
lekerneldifferent frequencies = more latency, more chances for non deterministic bugs that maximize time wastage, simulation difficulties that maximize time wastage even more15:53
lekerneland how does the framebuffer communicate with the cpu?15:54
lekernelto set scan address, video timing parameters, etc.15:54
_florent_yes on this point you have to resynchronize all signals15:55
lekernelhave some clock domain transfer support in csrgen?15:55
_florent_why not ;)15:56
lekernelyeah, could work...15:57
lekernelbut15:57
lekernellet's say we want 1080p15:57
lekernelthen we have to run relatively large amounts of logic at 148MHz, which is, as I know so well, a royal pain in slowtan615:57
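For reference, the 148 MHz figure matches the usual pixel clock calculation: total pixels per line times total lines per frame times the refresh rate. The sketch below assumes the standard CEA-861 totals for 1080p60 (2200 x 1125 including blanking), which are not stated in the log.

```python
def pixel_clock_hz(h_total, v_total, refresh_hz):
    """Pixel clock from total frame size (active area plus blanking) and refresh rate."""
    return h_total * v_total * refresh_hz


# 1080p60 with assumed CEA-861 totals: 2200 pixels per line, 1125 lines per frame
print(pixel_clock_hz(2200, 1125, 60) / 1e6, "MHz")
```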
_florent_yes but the framebuffer is maybe not a good example for that15:58
_florent_if you have 1 Asmicon + N totally independent cores that need memory accesses15:59
lekernelI'd try to run everything on one single clock. minimizes memory latency and headaches.16:00
_florent_my idea was that it's easier to have a generic port that can run at the core frequency instead of doing all the clock domain crossing directly in each core16:00
lekernelwhy do all those cores need different clock domains?16:01
larscbecause they can ;)16:01
_florent_;)16:04
_florent_Imagine you have a video multiplexer, 2 SD inputs, 2 HD inputs, 2 SD outputs, 2 HD outputs, and you want to be able to redirect each SD input to each SD output, same for HD. I find it easier to have an async port than to handle the clk domain transfer in each port16:08
_florent_but anyway, for now I only want to be able to run the CPU at more than 50 MHz16:08
lekernelI'd rather implement read/write leveling than waste time on hacking asynchronous ASMI ports16:09
lekernelWL should be easy if Xilinx got the calibrated IODELAYS right in the 7 series16:10
lekernelfor RL you just need to use DQS for reading16:10
lekernelI recommend you do a small soft FIFO that can store two bursts (ie 16 bits deep)16:10
lekernelthen for data recapture just read the FIFO with the worst-case delay16:11
lekernel(read in the system clock domain)16:11
lekernelthere's only one annoying detail, you won't do a FIFO with DDR registers16:12
lekernelso you need an IDDR16:12
lekerneland the last data pair will get stuck in it16:12
lekernelto solve this I propose the controller issues one dummy read whenever there is a "bubble" in the read flow, to make the DDR toggle DQS and clock the data out of the IDDR and into the FIFO16:13
lekernelthe dummy read is easy, just repeat the last read command immediately - it's guaranteed to be a page hit that will produce a continued burst16:14
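The stuck-pair problem and the dummy read can be pictured with a toy model. This is a sketch under a strong assumption taken from the description above: the capture stage holds one data pair and only passes it on to the FIFO when DQS toggles again. It is not a model of the actual Xilinx IDDR primitive, and all names are illustrative.

```python
class DqsCapture:
    """Toy capture path: one holding stage clocked by DQS, feeding a FIFO."""

    def __init__(self):
        self.held = None   # pair captured on the most recent DQS cycle
        self.fifo = []     # pairs that have reached the FIFO

    def dqs_cycle(self, pair):
        # Each DQS cycle pushes the previously held pair into the FIFO
        # and captures the new one in its place.
        if self.held is not None:
            self.fifo.append(self.held)
        self.held = pair


def read_burst(capture, tag, pairs_per_burst=4):
    """Feed one burst (8 transfers = 4 pairs by default) through the capture path."""
    for i in range(pairs_per_burst):
        capture.dqs_cycle((tag, i))


cap = DqsCapture()
read_burst(cap, "real")
stuck_before = cap.held            # last real pair has not reached the FIFO yet
read_burst(cap, "dummy")           # repeat of the last read keeps DQS toggling
real_in_fifo = [p for p in cap.fifo if p[0] == "real"]
print(stuck_before, len(real_in_fifo), cap.held)
```

In this model the dummy burst flushes the last real pair into the FIFO, and the pair left behind in the holding stage is dummy data that the controller would discard anyway.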
_florent_ok thanks, I remember we discussed that before, now that I have something working on board, it will be easier to work on it16:17
lekernel)=(//(! thunderstorm17:52
lekernelI have Pearson correlation coefficients of 0.56, 0.53 and 0.78 between wer0/wer1, wer1/wer2 and wer2/wer018:39
lekernel7K samples18:41
Alarm_I do not see Xiang fu on IRC and he does not respond by email?18:42
larsclekernel: what are werX?18:43
lekernelnumber of noncontrol words with too many transitions received during the last 2**24 words18:44
lekernelX = channel number18:44
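A software sketch of the counter described above, for illustration only. It assumes 10-bit words (as in TMDS), assumes the input list already contains only non-control words, and takes the transition limit as a parameter because the log does not state the threshold used by the hardware.

```python
def count_transitions(word, width=10):
    """Number of adjacent bit pairs in `word` that differ."""
    bits = [(word >> i) & 1 for i in range(width)]
    return sum(a != b for a, b in zip(bits, bits[1:]))


def word_error_count(words, max_transitions, width=10):
    """Count words with more than `max_transitions` transitions.

    `max_transitions` is a hypothetical parameter, not a value taken from the log.
    """
    return sum(1 for w in words if count_transitions(w, width) > max_transitions)


sample = [0b0000000000, 0b0101010101, 0b0000011111]
print([count_transitions(w) for w in sample], word_error_count(sample, 5))
```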
larscah18:44
lekernelcould be clock skew/glitches/failure?18:44
larscAlarm_:  <@qi-bot> larsc, xiangfu (~xiangfu@123.113.243.136) was last seen quitting #qi-hardware 12 hours 54 minutes ago18:45
lekernels/skew/jitter18:45
Alarm_larsc: OK thanks18:47
larsclekernel: so this means there is quite a bit of correlation, right?18:59
lekernelI'm not a statistics expert, but I'd think so19:00
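For readers who want to reproduce this kind of figure, a Pearson coefficient can be computed as below. This is a minimal sketch in plain Python; the sample values are made up for the example and are not the WER data from the log.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 samples")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up per-channel error counts, only to show the call
wer_a = [3, 7, 2, 9, 4]
wer_b = [4, 6, 1, 10, 5]
print(round(pearson(wer_a, wer_b), 3))
```

The result ranges from -1 to 1; values near 1 mean the two channels tend to see errors at the same time, which is what pointed towards a shared cause such as the clock.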
lekernellol: removing the DCM_CLKGEN, which supposedly provides better clock jitter tolerance than the PLL, results in WER=0 and no more picture noise19:48
lekernelguess I just have to sort out the memory speed issues now, and the video mixer will be perfect :)19:49
wpwrakjust don't use anything xilinx recommends :)19:49
wpwrakregarding the correlation, we already know there's a strong long-term correlation (the temperature dependency)19:53
lekernelthat was correlation between the error rates on each channel, which suggested a clock problem19:54
lekernelor some other source of noise that would affect them all at the same time19:54
wpwrakyes. the temperature issue showed that too (without explaining the underlying problem, though)19:56
lekerneloh, I can already do 720p at WER < 5 :)19:56
lekernel1280x72019:56
lekernelnot bad for that add on board19:56
wpwrakadd a negative DCM_CLKGEN and it'll be perfect ;-)19:56
wpwrakindeed. you're way above any frequency such a contraption can reasonably be expected to handle19:58
lekernelhmm, perhaps I can even have 1080p24 on the inputs - which is a HDMI standard - and 1080p60 at the output20:48
lekernelthat's a bit more than 8Gbps memory bandwidth, challenging but maybe doable20:48
lekernelso I guess next step is to fix ASMI20:49
lekernelah, no it's 12Gbps bandwidth. won't work :(20:50
lekernelmaybe if I output 1080p30 or 24 - don't know if monitors accept it from VGA ...20:51
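A rough way to redo this kind of estimate is to sum the bandwidth of every stream that touches memory. The sketch below counts active pixels only and assumes 32 bits per pixel; the log states neither the pixel format nor how many streams are read and written, so it does not attempt to reproduce the 8 or 12 Gbps figures.

```python
def stream_bps(width, height, fps, bits_per_pixel):
    """Bits per second for one video stream, active pixels only."""
    return width * height * fps * bits_per_pixel


def total_gbps(streams):
    """Sum of several (width, height, fps, bits_per_pixel) streams, in Gbit/s."""
    return sum(stream_bps(*s) for s in streams) / 1e9


# Assumed 32 bits per pixel; each memory write or read of a stream counts once
one_input_1080p24 = (1920, 1080, 24, 32)
one_output_1080p60 = (1920, 1080, 60, 32)
print(total_gbps([one_input_1080p24]), total_gbps([one_output_1080p60]))
```

Each additional input written to memory, and each buffer read back at the output rate, adds another term to the sum, which is how the total climbs quickly with 1080p60 on the output side.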
mwallehi20:53
lekernelhi mwalle20:54
--- Wed May 29 2013 00:00
