#milkymist IRC log for Sunday, 2012-03-18

Action: Fallenou deactivated irq, removed anything not needed from the bios, uses uart with "force_sync" mode, flushes D-cache after each disable_mmu(); and it still crashes right after test n°101:58
Action: Fallenou going to sleep, will bang his head against the wall a bit more tomorrow on this02:04
Fallenoun8 !02:04
wolfspraulsleep does magic, tomorrow you have a new idea :-)02:05
wpwrakwriting an MMU must be as much fun as writing a compiler. the latter is quite traumatizing. for years after, your first thought on any program crashing/exhibiting a problem is that it must be a compiler bug.02:20
wolfsprauldon't disturb his dreams02:26
lekernelFallenou: simulate it... you can't see anything now10:01
Fallenoulekernel: is the HEAD of milkymist-ng functional ?13:13
FallenouI would like to update my copy, to get the uart to simulate my minimalist bios13:13
lekernelyes, except sdram of course13:36
Fallenouok thanks13:37
lekernelyou can try to simulate it first (with the migen simulation extension)13:37
lekernelthis involves fixing icarus though13:37
FallenouI was planning on simulating it with Isim13:37
Fallenouso when I write something in the uart it will $display() it ?13:38
lekernelyou can do more than that - redirect UART both ways to a PTY13:38
Fallenouhum nice but I can't do that with Isim, can I ?13:39
FallenouI don't need to type anything so I think it's too much for what I need13:39
lekernelno, but fixing icarus would be nice to do anyway13:39
Fallenouif I can do without fixing Icarus it won't be in my todo list13:40
FallenouI only need uart output13:40
lekernelbesides being incompatible with migen, isim is proprietary and bloated13:40
Fallenou$display in Isim would be enough13:40
FallenouI don't mind if it's proprietary, even if it kills kittens, I just need it to debug my code13:40
GitHub169[milkymist-ng] sbourdeauducq pushed 1 new commit to master: http://git.io/SRkzIw14:04
GitHub169[milkymist-ng/master] asmicon: move slot time to timing settings - Sebastien Bourdeauducq14:04
Fallenouto be more exact, I do mind if it's proprietary, I just don't want to spend too much time on something else than the mmu itself, just to focus. I will try to use tools that work today, if it's not enough I will try icarus but I first try with things that work today14:06
lekernelwell, it seems to me that having excellent simulation tools is important for this task14:08
lekernelotherwise you'll just spend a lot of time with frustrating failing experiments on the hardware14:08
FallenouI do simulations, don't worry, I test on hw when I'm confident that it could work because I've seen it working on the simulation14:11
Fallenoubut sometimes I'm wrong and then I indeed lose time on the hw14:11
Fallenouand then I go back in simulation14:11
lekernelalso, if you have the UART, the libc, etc. in the loop, it becomes quite a mess14:19
lekernelwhat simulations have you done so far?14:19
stekernor then you have one of those fun occasions where it works in simulation but not in hardware, even if you're doing the same thing in both14:19
lekernelstekern: if you don't use X, blocking assignment, or fancy asynchronous stuff, it shouldn't happen14:20
lekernelwhat often happens, though, is incomplete test coverage14:21
Fallenouwhat's self.rxtx.re ?14:21
Fallenouthe .re14:21
lekernelmeans that the register has been written from the bus and the device (uart) should read it14:21
stekernif you're not hitting some synthesis specific bug, that could of course be determined with post-synthesis simulations14:22
lekernelFallenou: have you done simulations for all possible cases of the pipeline control signals?14:22
Fallenouok I need to add a display() there14:22
Fallenoulekernel: maybe not all but I generated 16 test cases, all are passing in simulation, none on hw14:22
lekernelFallenou: what do you mean, "test case"? test program?14:23
Fallenouso I wanted to start on understanding why it fails14:23
lekernelFallenou: and do these programs generate all sorts of pipeline and bus timings?14:23
lekernelFallenou: does the TLB work correctly in the presence of memory latency?14:23
Fallenouit's very minimalist test for the moment14:24
Fallenouit's just a combination of sw lw and nop14:24
Fallenoubut I will add more in the future14:24
Fallenouwith more instructions and combinations14:24
Fallenoubut since those are not passing ...14:24
lekernelFallenou: you don't only need to check instruction combinations. what's really important is the internal pipeline control signals.14:25
Fallenou15:28 < lekernel> Fallenou: does the TLB work correctly in the presence of memory latency? < well in theory yes, I do mind the stall* signals14:25
Fallenoustall_x stall_m14:25
lekernelyes, but have you run simulations that show clearly that the whole CPU behaves correctly when those signals are asserted in the TLB?14:25
lekernelthis would be the nr. 1 source of bugs imo14:26
Fallenouat first it was indeed a source of bug14:27
Fallenoubut I think that now it's OK14:27
FallenouI was getting different signal results if I was accessing an already cached data for example14:27
lekernelwhen you have a bug, that's usually because you have thought it was OK before :)14:27
Fallenouor if I was accessing a not already cached data14:27
Fallenoubut now it's ok14:27
lekerneland? are you sure there are no such critters anymore?14:27
Fallenouwell I must have missed something, since it's failing :)14:29
Fallenouso I am trying to get a more "close to real hw" simulation14:29
Fallenouto identify the issue14:29
lekernele.g. you could have 4 test benches, each showing proper functionality under each combination of (stall_x, stall_m)14:29
Fallenoufor example a simulation with uart and irq14:29
wpwrakFallenou: does it fail at the "t" of the very first test ? or later ?14:52
Fallenouyes the t14:52
Fallenouit prints the t and nothing more14:53
wpwrakdo the asm statements create loads/stores before or after them ? i.e., between the test code and disabling/enabling the MMU ?14:54
Fallenoubefore yes14:55
Fallenouwell at first I did something like enable_mmu() generate_test(i, j); disable_mmu()14:55
Fallenouand to avoid code being generated between enable_mmu() and the test14:56
FallenouI merged the mmu activation inside the generate_test()14:56
Fallenouyou can see the first 3 asm instructions of generate_test(i,j) are actually the mmu activation14:56
wpwrakbut not the deactivation. maybe it's that ?14:56
Fallenouand 3 nops after that just in case14:56
lekernelare you sure the "for" loops don't generate load/stores?14:56
Fallenoulekernel: the code you are looking at is not running on the lm32, this code generates the test code14:57
Fallenouthere is no loop in the real test code14:57
Fallenouwpwrak: let me see !14:58
wpwrakstupid question: if you remove the enabling of the MMU, does the test run to completion (with 100% failure then) ?14:58
Fallenouok no there is no code generated by the compiler between the load/store of the test and the disable mmu14:59
Fallenouwpwrak: hum good question actually :p15:01
Fallenoulet's do the test15:01
wpwrakand don't you need a clobber for r0, too ?15:01
FallenouI never modify r0 :o15:01
Fallenou%0 means the first register15:01
wpwrakyou xor it a lot15:01
Fallenour0 is always 015:01
Fallenouit's meant to be that way15:02
wpwrakah, i see. so these are nops15:02
Fallenouyes :)15:02
Fallenouthose nops are not needed in theory, I added them in a desperate attempt to make test pass15:03
wpwrakanother thing to try would be to remove the stores. maybe your test is overwriting the program or such15:03
wpwrakhehe ;)15:03
Fallenouok nice two tests to do15:03
Fallenouwill do that15:03
Fallenouthanks !15:03
wpwrakof course, in the end you'll find it's a comma missing somewhere or so. all the viciously complex bugs have a trivial cause :)15:04
Fallenouhehe I hope you're right =)15:05
lekernelthis is running from flash, so a misplaced write could put the flash into "status" mode15:05
Fallenouoh !15:06
Fallenouremoving the three assembly lines that activate mmu makes the lm32 run through the test code like a charm15:06
Fallenouhum hum15:07
wpwrakgood. so the basic structure is good.15:07
wpwraknow let's see what the stores do or don't15:08
FallenouI was trying with only 1 test and no printf/puts, now I did with all the 15 tests and printf puts, it's writing "PASS" and so on15:10
FallenouI removed almost all the bios15:10
Fallenounow it's just doing init_uart() dtlbtest(); while(1) puts(".");15:10
wpwrak(PASS) that is still without enabling the MMU ?15:10
Fallenouso I see my tests and then a lot of "....."15:11
Fallenouyes without MMU, tests are not very well designed15:11
Fallenouthey pass even without mmu activated15:11
wpwraksweet ;-)15:11
Fallenouhehe yes it's a serious issue15:11
Fallenoubut if mmu is activated and they still pass it's a good news :)15:11
Fallenoubut yes they should fail without mmu15:11
wpwrakor maybe the mmu senses the danger of failure and auto-activates to save you ;-)15:12
wpwrakoh. so something solved the troubles you've been having ?15:12
Fallenouit's not hard to correct the test anyway, not a big deal15:13
wpwrakbut you're saying the test that used to crash/hang now doesn't ?15:14
Fallenou16:17 < wpwrak> oh. so something solved the troubles you've been having ? < no, test "pass" without mmu, but if I uncomment the mmu activation it still crashes15:14
wpwrakah, i see15:14
Fallenouyes test does not crash, but mmu is not activated15:14
wpwrakokay, let's see what happens if you enable the MMU but take out the stores15:15
Fallenoufirst test fails and the other pass15:17
Fallenoubut mmu is still deactivated15:17
Fallenouoh wait15:17
FallenouI did not update my test code ...15:17
Fallenouok !15:20
Fallenouso now, with mmu activated, no store, all test RUN, and FAIL15:21
Fallenouand no crash :)15:21
Fallenougood news !15:21
Fallenouwell I think ...15:21
wpwrakthe noose around the bug's neck is tightening :)15:22
Fallenouso either the store is writing at a flash address15:22
Fallenouor it's writing to 0 or around15:22
Fallenou(because of the mmu doing a wrong translate)15:22
wpwrakcan you make the MMU run but not change the address that gets sent on the bus ?15:22
wpwrakand perhaps write the translated address to a debug register where you could then look it up ?15:23
Fallenouwhen I say "mmu deactivated"15:23
Fallenoummu is there, running, but it does not replace the virtual address by the physical one15:23
lekernelwpwrak: that's for this sort of thing that there's simulation...15:23
wpwrakah, perfect. can you thus store the translated address in a debug register ?15:24
lekernelyou can just print it :)15:24
FallenouI could just add a CSR for that !15:24
wpwraklekernel: seems that he already tried simulation and didn't see the problem15:24
wpwrakFallenou: yeah :)15:24
lekernelyes, but not with this program15:25
Fallenouwith the same crt0.S and the same function running15:25
Fallenoubut not the same path between the crt0.S and the function15:25
Fallenoubut the assembly code of the test was the same15:25
Fallenouand in Isim it was working15:25
Fallenouit was writing to the correct physical addresses the correct values15:26
Fallenoufor all tests15:26
FallenouI added a display(); in soc.v when |frag_we is true to print address and value of stores15:26
Fallenouok now let's synthesize ...16:08
Fallenouhum hum17:17
FallenouI run the tests with mmu deactivated, after a store word I do a rcsr %1, TLBDBG (to get the result of previous tlb lookup)17:18
Fallenouand it prints 0x4800000017:18
Fallenouit should be 0x44000000 :o17:20
Action: Fallenou goes back to simulation =(17:22
Fallenoufirst let's see if the new csr is working properly17:38
Action: kristianpaul had crashed the soc before doing silly things inside the csr17:41
Action: Fallenou talking about csr inside lm3217:42
kristianpaulah !17:42
kristianpauloh i need check that too ;)17:42
Fallenouhum the simple fact of adding the rcsr crashes the cpu :)18:15
wpwrakFallenou: i thought that worked ? or how else did you obtain the 0x48000000 vs. 0x44000000 before ?18:30
Fallenouwell on real fpga I get 0x48....18:31
Fallenouon isim the cpu just stops working18:31
Fallenoudoesn't make any sense18:32
wpwrakfunny :)18:32
wpwrakthey take turn in who gets to fail horribly18:33
Fallenoumaybe it's a bad idea to use the csr "0x1F"18:36
wpwrakwhy ?18:39
Fallenouwill try with the tlbctrl18:41
FallenouI was only using the wcsr on it, the rcsr is free18:41
Fallenouhehe, it works with another csr18:41
wpwrakwait .. 0x1f for csr_addr ?18:41
wpwrakisn't that 4 bits ?18:41
Fallenouthe csr id18:41
wpwrak0x1f may overlap with USB (0xf aka 4'hf)18:42
lekernelthat's a different sort of CSR18:43
Fallenoucsr inside lm32 is on 5 bit18:43
wpwrakoh :)18:43
FallenouI get non-aligned memory access ... hum hum !18:43
Fallenoubbl after dinner !18:45
Action: Fallenou is back19:12
FallenouI must have done something wrong with the rcsr stuff19:52
wpwrakprobably. in the movies, the self-destruct mechanism always has a countdown. that seems to be missing here.19:55
Fallenouhere you just need to execute a rcsr r11, TLBCTRL , and then count up to 0 and boom :)19:57
wpwrakactually, kristianpaul should be interested in your code. he's been looking for CSR reads with a side-effect. well, on the other CSR, but maybe some concepts can be shared :)19:59
Fallenousure I would be happy to make his project crash !20:01
Fallenoumore seriously adding a csr seemed quite simple20:02
Action: Fallenou wonders what's wrong20:02
kristianpaulseemed !20:02
kristianpaulthat's what we all thought at first ;-)20:02
stekernfamous last words ;)20:02
wpwrakif you don't read the TLBDBG, does it still crash ? i.e., is the problem in the CSR access or in storing the lookup result in TLBDBG ?20:05
kristianpaulanything wrong in this makefile http://pastebin.com/DV4kkSYA that would give "cannot find -lbase" and "cannot find -lhal" ?20:09
lekerneldo you have those libs in $(MMDIR)/software/libbase or $(MMDIR)/software/libhal ?20:11
kristianpaulyes i do20:13
kristianpaulthats the odd thing..20:13
lekerneland have they been compiled correctly?20:14
lekerneland built with the correct ar?20:14
kristianpaullet me check20:14
Fallenouwpwrak: if I don't do rcsr TLBCTRL it does not crash, if I do it : it crashes :/20:19
lekernelFallenou: what did you modify to add that csr? I can't find that patch in your commits ...20:20
kristianpaulnoob question, do i need libhpdmc for something in a baremetal app that access sdram?20:24
lekernelif it's booting directly from flash (not through the bios), yes20:25
lekernelalso note that executing from flash is slower than from the sdram20:25
kristianpauldo you have a skeleton of an app executed after the bios loads it to ram?20:27
kristianpaulor  a simple c hello world should work i guess20:27
lekernelyes, there was the demo firmware20:27
Fallenouit's pushed now20:27
kristianpaulwhere is it right now? :-)20:27
kristianpaulok i was ignoring this flash/sdram difference and here my suffering..20:28
kristianpaulbut what would be proper? let bios do initialization stuff20:28
kristianpaulthen load the other app from serial/net and work on ready to use ram?20:28
lekernelhere your suffering. poor kristianpaul :)20:29
kristianpaulnote that i tend to exaggerate a bit with my words ;)20:30
sh4rm4Fallenou, you shouldnt mix printf with puts20:31
sh4rm4glibc for example will sometimes mess up the order20:31
Fallenouhere there is no difference20:32
kristianpaulhey but i can still put that "other app" in the hardcoded partition for FN and it should boot correctly?20:32
Action: kristianpaul dont want mess with net/serial boot right now20:32
lekernelsh4rm4: weird - sometimes gcc replaces printf() with puts()20:33
lekernelwhen there are no format specifiers20:33
sh4rm4i had the impression that puts uses write() while printf uses fwrite()20:34
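A minimal sketch of the printf/puts point, under the assumption that both calls target the same stdio stream: in that case they share one buffer and the output keeps program order, and reordering usually only appears when buffered stdio is mixed with raw write() calls. The helper names emit_mixed and mixed_output_in_order are hypothetical, and the output goes to a temporary file only so that the order can be checked.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: mixes fprintf() and fputs() on one stream,
 * the way a test might mix printf() and puts() on stdout. */
void emit_mixed(FILE *out)
{
    fprintf(out, "test %d: ", 1);  /* formatted output */
    fputs("PASS\n", out);          /* plain string, no format parsing */
    fprintf(out, "done\n");        /* no format specifiers: for printf() on
                                      stdout, gcc may emit puts() instead */
}

/* Returns 1 if the three pieces arrive in program order, 0 otherwise. */
int mixed_output_in_order(void)
{
    char buf[64];
    size_t n;
    FILE *f = tmpfile();

    if (f == NULL)
        return 0;
    emit_mixed(f);
    rewind(f);                     /* flushes the buffer and seeks to start */
    n = fread(buf, 1, sizeof buf - 1, f);
    buf[n] = '\0';
    fclose(f);
    return strcmp(buf, "test 1: PASS\ndone\n") == 0;
}
```

On a bare-metal bios where both functions end up in the same UART write routine, as Fallenou suggests, the distinction should not matter either.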
lekernelFallenou: are you aware that it would break if debug is disabled?20:34
wpwrakFallenou: and how about TLBDBG ?20:34
Fallenouwpwrak: I thought csr 0x1f was not safe to use (but maybe it is) so I use tlbctrl now20:35
FallenouI removed tlbdbg (0x1f) to use tlbctrl20:35
lekernelFallenou: ok. either way, there's a problem if the user disables CFG_DEBUG_ENABLED20:36
Fallenouoh yes20:37
FallenouI should add something to force width 5 if mmu is enabled at synthesis time20:38
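A small sketch of why the CSR id width matters here, assuming (the log does not confirm this) that a narrower configuration simply drops the upper bits of the id field. The function names are hypothetical; only the widths 3, 4 and 5 and the id 0x1f come from the discussion.

```c
#include <stdint.h>

/* Keep only the low `width` bits of a CSR id, as a narrower
 * id field would (assumes width < 32). */
uint32_t csr_id_as_decoded(uint32_t csr_id, unsigned width)
{
    return csr_id & ((1u << width) - 1u);
}

/* Returns 1 if the id survives unchanged in a field of `width` bits. */
int csr_id_fits(uint32_t csr_id, unsigned width)
{
    return csr_id_as_decoded(csr_id, width) == csr_id;
}
```

Under that assumption, csr_id_fits(0x1f, 5) holds, while a 4-bit field would turn 0x1f into 0xf and alias another register, which is why forcing width 5 whenever the MMU is enabled looks like the safe choice.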
GitHub35[milkymist-ng] sbourdeauducq pushed 1 new commit to master: http://git.io/6uYMqQ21:17
GitHub35[milkymist-ng/master] asmicon: multiplexer (untested) - Sebastien Bourdeauducq21:17
GitHub11[migen] sbourdeauducq pushed 1 new commit to master: https://github.com/milkymist/migen/commit/d47b564fad08eb5daeade79e752c42e0d14f2a8921:19
GitHub11[migen/master] corelogic/fsm: typo - Sebastien Bourdeauducq21:19
lekerneldraft of the new SDRAM controller is complete... now leeet's fix a hundred bugs ...21:23
Fallenouhehe have fun !21:23
Action: lekernel scratches his head about the best testing strategy for this little mess21:25
kristianpaulhow it can be complete having bugs? :)21:45
kristianpaulah,  those bugs dont belong to the controller !21:46
mwallekristianpaul: yeah i got some faster uart working, but that hasn't brought the boost i thought22:17
mwallejust got 50kbs on gdb22:17
mwalleone major problem is the async character of the uart, you have to use the LCM of the baud settings the ft2232 and mm can run at22:19
mwalleso without some redesign of the uart core youre stuck at 4mbps iirc22:20
Fallenouwhen the cpu "crashes" while doing the rcsr, I can see that wishbone D_STB_O and D_CYC_O are asserted, and stay asserted indefinitely, they never get acked (by D_ACK_I)22:46
Fallenouthat must be the reason for the system hang22:46
Fallenouand D_ADR_O is 0xfffff004 (o_O ?)22:47
wpwrakFallenou: maybe you can route such signals to LED1/LED2 (B16/A16) in hardware. that way, you can observe visually if they suddenly go DC22:57
Fallenouto check if it's the same issue on hw ?22:57
FallenouI wonder why a rcsr starts a wishbone transaction22:57
wpwrakthat too, in case you still have that problem. i thought of it more as a means to monitor activity. it's not uncommon that you can actually see an unusual pattern22:58
wpwraklike in the old days, you could hear from the FM interference if your home computer was stuck in a busy loop :)22:59
Fallenouhehe :)22:59
Fallenouactually I had such an experience, when I was scrolling in some software, it was disturbing the radio23:00
wpwrak(csr vs. wb) wasn't there an address length change ? at least this sounds like it "Fallenou> I should add something to force width 5 if mmu is enabled at synthesis time"23:00
wpwrakor are the two completely separate ?23:01
Fallenouthe length we were talking about is the number of bits reserved for the csr id in lm32 opcode rcsr and wcsr23:01
Fallenouit's either 3 or 4 or 5 bits23:01
Fallenouand it's only inside lm32 design23:02
Fallenouit never goes out23:02
Fallenouit's internal registers23:02
Fallenouit has nothing to do with "CSR" on the SoC23:02
wpwrakhmm, then it's indeed a bit mysterious23:02
Fallenouyes I just don't get it23:02
wpwrakFPGA-internal crosstalk ? :)23:02
Fallenouhehe I don't believe that23:02
Fallenouso I guess cpu is stuck because no one ACKs the wishbone transaction23:05
Fallenoubut why nobody acks it ?23:05
Fallenoudoes the wishbone arbiter check address boundaries ?23:06
wpwrakcan you see if the WB transaction starts before or at the CSR ?23:07
wpwrakbtw, do the two CSRs really have to have the same name ? that's horribly confusing23:08
Fallenouassign frag_slave_sel = (wishbonecon0_wishbone_adr_o[28:26] == 3'd0);23:08
Fallenouhum hum !23:08
Fallenouindeed with 0xfffsomething it's hard23:08
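The decode line quoted above can be mirrored in C to see why 0xfffff004 is never acknowledged by that slave. This is only a sketch of the single assign statement from the log; the C function name is borrowed from the signal name, and the log does not say whether the value seen on the bus is a byte or a word address, so both readings are tried in the note below.

```c
#include <stdint.h>

/* Mirrors: assign frag_slave_sel = (adr_o[28:26] == 3'd0);
 * Returns 1 when bits 28..26 of the address are all zero. */
int frag_slave_sel(uint32_t adr)
{
    return ((adr >> 26) & 0x7u) == 0;
}
```

For 0xfffff004, bits 28..26 are 7, and the same holds for the word-address reading 0x3ffffc01, so this select line stays low either way. If no other slave claims the address, nothing ever drives the ack and the cycle hangs.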
Fallenouwpwrak: yes it's better to have a csr just for our debugging, I did that at first, and then seeing everything crashing I tried with an already existing one23:09
FallenouI should go back to tlbdbg23:10
Fallenouso wtf is this 0xfffff004 now :o23:10
wpwrak(csr) i mean the buses - the one inside the core and the one outside. just imagine someone trying to learn about the architecture. they'd go nuts trying to figure out all the contradictions in the documents.23:11
wpwrakcan you see when the 0xfff... first appears on the bus ?23:12
Fallenouwell there is the store word23:13
Fallenouand right after this transaction, another one begins (the wrong one) with D_ADR_O == 0xffff***23:13
wpwrakso it's not related to the rcsr ?23:15
Fallenouin the pipeline you have nop nop nop sw nop rcsr nop lw23:15
Fallenouwhen the store is executing PC_m (program counter of memory stage) == 0x25C which is the address of the sw instruction23:18
wpwrakso at that point in time already some decoding of rcsr has happened ? what happens if you add a few more NOPs after this sw ?23:19
wpwrak(but not after any that may come before)23:19
Fallenouwhen the weird transaction starts PC_m == 0x26C23:19
Fallenouoh, this is the lw23:19
Fallenouthat explains the wishbone access :)23:19
Fallenouso this gives one point to lekernel23:20
FallenouI just hit a case with weird pipeline signaling (with the rcsr)23:20
Fallenouso I just translate the address badly => 0xfffff00423:21
Fallenoufor my defense, it's totally weird, the load_q_m signal is not asserted ... it should be asserted while doing a load word :o23:22
wpwraklame excuse :)23:25
Action: Fallenou had to give it a try23:26
Fallenouok it's dcache_select_m fault, it just drops23:29
Fallenouleading to load_q_m not rising23:29
Fallenouarg ok the address is higher than max address cachable23:31
Action: Fallenou head bangs against the wall again23:31
wpwrakso you tried to load from an inaccessible virtual address ?23:34
Fallenouwell I don't know what happened but the address_x signal gets the value 0xfffsomething23:35
Fallenouand I have set up max_dcache_addr == 0x7ffff...23:35
Fallenouso as soon as load_store_unit sees address_x >= max_dcache_addr, it unassert dcache_select_m and there is no load_q_m in dcache module23:36
Fallenouand no mmu translation23:36
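The cacheable-range check described above can be sketched as a one-line predicate. This follows the comparison as Fallenou words it (address_x >= max_dcache_addr deasserts the select) and is not taken from the lm32 sources; the limit is passed as a parameter because the log only gives it as 0x7ffff..., and the function name is hypothetical.

```c
#include <stdint.h>

/* Returns 1 when the access would go through the data cache
 * (and therefore through the MMU lookup), 0 when it bypasses it. */
int dcache_select(uint32_t address_x, uint32_t max_dcache_addr)
{
    return address_x < max_dcache_addr;
}
```

With an illustrative limit of 0x7fffffff, an address such as 0x44000000 is cacheable, while 0xfffff004 is not, so no load_q_m is raised and no translation takes place.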
FallenouI'm wondering if we really need this max_dcache_addr, since we cache all the RAM23:36
Fallenouor maybe there is something memory mapped above the ram23:38
Fallenouoh yes ok :(23:38
wpwrakpoor little bug. it was hiding too well. yet you still found it.23:41
Fallenouok I got it ...23:41
Fallenouthe asm code is wrong23:41
Fallenoutotally wrong23:41
FallenouI do rcsr r2, TLBCTRL and then lw r14, (r2+0)23:41
Fallenouso indeed if the rcsr returns shit, it will load from address "shit"23:42
Fallenoubut why is r2 used ? I put a different %k register in the asm() statement23:42
Fallenoulet me show you23:43
wpwrakhah. and that explains why the rcsr changed things23:43
Fallenoudo you understand this ?23:44
Fallenouwhy does it put r2 for %1 and %3 ?23:44
wpwraka bit odd indeed. i don't quite remember all the asm constraints, though. maybe there's some special case. lemme see ...23:46
Fallenouall the variables are "register unsigned int"23:47
wpwraki think you need &23:48
Fallenouoh oh nice link23:51
FallenouI just don't understand the sentences23:51
Fallenouso you say I just do "=&" ?23:51
FallenouI should*23:52
wpwrakthe basic assumption seems to be that "asm" is typically used for a single CPU instruction. so that "early clobber" would be something unusual. that's probably why it's not the default23:52
wpwrakyes, =& looks good23:53
Fallenouhum ok single cpu instruction23:53
Fallenouweird assumption :'23:53
FallenouI tried with =&r for the tlb_lookup variable, it still generates only r2 register23:54
wpwrakwhen they started gcc, computer memory probably didn't have room for much more ;-)23:54
Fallenouoh no it's ok23:54
Fallenouit's still using r2 for rcsr23:54
Fallenoubut lw r14, (r14+0)23:54
Fallenougood good23:54
wpwrakgood :)23:54
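A hedged sketch of the early-clobber fix, written for x86 so it can be tried on a desktop machine (with a plain C fallback elsewhere); the lm32 sequence from the log was rcsr r2, TLBCTRL followed by lw r14, (r2+0). The function name and the marker value are made up for illustration. The point is the same: an output written by the first instruction of a multi-instruction asm() must be marked "=&r", otherwise gcc may give it the same register as an input that a later instruction still needs.

```c
#include <stdint.h>

/* Returns the word at *addr; *lookup receives a marker that the asm
 * writes BEFORE it performs the load. */
uint32_t marker_then_load(const uint32_t *addr, uint32_t *lookup)
{
    uint32_t marker, value;
#if defined(__x86_64__) || defined(__i386__)
    __asm__ volatile(
        "movl $0x1234, %0\n\t"  /* first instruction writes output %0 ...   */
        "movl (%2), %1"         /* ... while input %2 is still needed here  */
        : "=&r"(marker),        /* '&': never share a register with an input */
          "=r"(value)           /* written last, so it may reuse %2's register */
        : "r"(addr)
        : "memory");
#else
    /* Portable fallback with the same observable result. */
    marker = 0x1234;
    value = *addr;
#endif
    *lookup = marker;
    return value;
}
```

Without the '&', the compiler is allowed to allocate marker and addr to one register, and the load would then dereference 0x1234 instead of the intended address, which is the same failure mode as loading from whatever rcsr returned.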
Fallenouawesome, thanks !23:55
wpwrakthat would have been my next question :)23:55
Fallenouok so it simulates well now23:56
Fallenouno more cpu crash \o/23:58
wpwrakkewl !23:59
Fallenoutime to go to bed23:59
Fallenouhave to work tomorrow morning :'23:59
--- Mon Mar 19 201200:00
