#milkymist IRC log for Monday, 2012-07-23

larscmwalle: hm, the fix doesn't work. I've now added explicit mv instructions for the syscall registers. This generates quite a bit of overhead, but well it works for now.19:57
larscBusyBox v1.18.5 (2011-12-27 18:34:58 CET) hush - the humble shell19:57
larscEnter 'help' for a list of built-in commands.19:57
larsc# uname -a19:57
larscLinux (none) 3.5.0+ #2412 Mon Jul 23 21:58:10 CEST 2012 lm32 GNU/Linux19:57
kristianpauland the other busybox utils works?20:17
larscno mmu yet, though ;)20:24
Fallenoureally ? no one did a mmu ?20:27
Fallenousounds trivial to do  :o20:28
Action: Fallenou returns to its pipeline drawings20:29
kristianpaullarsc: GREAT !20:32
kristianpaulso i can run my silly apps on uclinux now :)20:32
mwallelarsc: does it work reliable? even in qemu?20:42
larscmwalle: I only tested in qemu so far20:42
sh4rm4you can build uclibc from the official kernel sources ?20:43
mwallelarsc: did you spawn some processes?20:43
larscmwalle: only a few20:43
mwalleuclinux? isnt that really ancient? :)20:43
larscbut i can run a while `true`; do data; done20:43
mwallelarsc: mh ok, cool ;)20:44
sh4rm4so this is a full linux ?20:44
mwallesh4rm4: no shared libs yet20:44
mwallesh4rm4: only static flat binarys20:44
Action: kristianpaul no cares as soon have a serial port and a working busybox20:45
kristianpaulmwalle: but it can run static elfs?20:45
sh4rm4i didn't know you can build official linux without mmu20:45
kristianpaulor i need embeded custom apps in owrt build?20:46
mwallekristianpaul: that worked for years now :b20:46
kristianpaulhe, asking just in case :)20:46
mwallebut there was still a bug causing linux to abort in qemu (and hanging in real hw i guess)20:47
mwalleand i guess signals arent working completely20:47
kristianpaulwhat that last means?20:48
mwalle(because thats still theobroma code)20:48
mwallelarsc: correct me if im wrong ;)20:48
larscI added a bug to it as well, when i cleaned it up20:49
mwallekristianpaul: dunno, timers/alarms may not work20:49
larscctrl+c works fine20:50
larscso signals are at least somewhat working20:50
mwallewpwrak: (tlb miss handler) my idea was to distinguish between itlb and dtlb by the control word you write in the control reg20:50
kristianpaullarsc: what about mmu-less malloc?20:51
mwalleTLBVADDR TLBPADDR would be shared between both20:51
mwallekristianpaul: sbrk should work20:51
mwallekristianpaul: of course theres no memory protection between tasks20:52
mwallelarsc: btw i guess i should push my elf2flt patches upstream..20:52
larscanonymous mmap also works20:53
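The two allocation paths mentioned here (sbrk and anonymous mmap) are what a no-MMU libc typically builds malloc on. A minimal sketch of the anonymous mmap path follows, using only the standard POSIX/Linux calls `mmap` and `munmap`. The helper names `anon_alloc` and `anon_free` are made up for illustration, and nothing here is specific to the lm32 port discussed in the log.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS under strict -std= modes */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: get len bytes of private, anonymous memory.
 * Returns NULL on failure instead of MAP_FAILED. */
static void *anon_alloc(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: release a region obtained from anon_alloc().
 * Returns 0 on success, -1 on error (as munmap does). */
static int anon_free(void *p, size_t len)
{
    return munmap(p, len);
}
```

As mwalle points out, without an MMU there is no protection between tasks, so a stray write through such a pointer can corrupt another process.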
mwallewpwrak: i dunno if its really worth the hassle to implicitly update xTLB on TLBPADDR write20:56
mwallewpwrak: btw does unlikely work with gcc-lm32? :)20:56
mwallemh http://pastebin.com/ZUfrWs2s21:04
mwalleah forget it ;)21:05
mwallestill strange: http://pastebin.com/tVAvBqWN, likely() produces worse code than code without annotations21:18
Fallenouwhat's __builtin_expect() doing ?21:22
sh4rm4it optimizes code for the likely branch21:25
Fallenouoh ok, you as a developer tell gcc what branch you think is most likely to be taken21:25
sh4rm4yep, like p = malloc()... if (unlikely(p)) { ...21:26
sh4rm4oops, !p21:26
Fallenouok :)21:26
Fallenougot it, thx21:26
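The likely/unlikely exchange above can be sketched in C as follows. The two macros have the same shape as the ones in the Linux kernel and rely on GCC's `__builtin_expect`. The helper `alloc_zeroed` is a made-up example, not something from the pastebins, which are not reproduced here.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* !!(x) normalises any truthy value to 1 so it can be compared
 * against the expected constant. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical helper: allocate n zeroed bytes, NULL on failure. */
static void *alloc_zeroed(size_t n)
{
    void *p = malloc(n);

    if (unlikely(!p))        /* allocation failure is hinted as the cold path */
        return NULL;

    memset(p, 0, n);         /* hot path: falls straight through */
    return p;
}
```

The hint only influences code layout and branch ordering. It does not change the result, so a wrong hint costs speed, not correctness.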
wpwrakmwalle: (likely/unlikely) interesting :)21:27
wpwrakmwalle: (itlb/dtlb) if you put it in the control word, this may add a few more checks and branches21:29
wpwrakmwalle: (hassle) dunno if it's a hassle. if it's easy, you save one CSR write. cyclesss are (my) precioussss ;-)21:30
FallenouI agree we should optimize tlb miss as much as possible21:31
Fallenouas they will happen a looot21:32
FallenouI quickly had a look at your email wpwrak it sounded great, even if I didn't get everything21:33
mwallewpwrak: so we need at least an own tlb miss handler22:35
wpwrakyeah, i'd say so22:36
mwalleand since we dont want to distinguish between itlb and dtlb within the handler code, we need either make the hw remember the current tlb or we need two exceptions22:37
mwallethen i vote for the second22:37
mwalleto keep hw simple22:37
wpwrakFallenou: i'm glad you like it :) and yes, while writing, i noticed how it was gradually getting trickier :)22:37
wpwrakmwalle: we could reuse Fallenou's idea of indicating the TLB with a bit in the address. e.g., VADDR[0]. that way, just writing/keeping the fault address would select the correct TLB.22:38
mwallemh .. but if we need to set some magic bits in the paddr... thats one more instruction22:39
wpwrakmwalle: otherwise, yes, separate handlers are easier than fancy status bits22:39
mwallewe could use wcsr TLBCTRL, r1 too..22:39
wpwraki would have the magic in the VADDR. use free PADDS bits for permission and such22:39
wpwrak(PADDR) because the PADDR can come straight from the page table. which is where all the other bits live, too22:40
mwallewpwrak: the thing is, with the bits magic, PADDR and VADDR are two separate registers for both TLBs and you select between them with the lowest bit but you have to set that bit for both PADDR and VADDR22:41
wpwrakso the PADDR would be sort of a union { unsigned page_addr:20; unsigned flags:12; }22:41
wpwrakwhy ? have one set of registers and only use VADDR[0] to select the TLB22:42
mwalleah, yes ;)22:42
wpwrakwhen entering the TLB miss handler, VADDR is already pre-set, including VADDR[0]22:42
wpwrakthen, with my "magic" PADDR write, you just fetch the page table entry and write to update. very simple :)22:43
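The "sort of a union" PADDR layout described above (20 bits of page address, 12 bits of flags) can be sketched with plain masks. Strictly, a C union would overlay the two fields, and bit-field ordering is implementation-defined, so shifts and masks are used here. The 4 KiB page size follows from the 20/12 split. The flag names and bit positions are pure assumptions, since the log does not assign any.

```c
#include <stdint.h>

#define PAGE_SHIFT 12u                                  /* 20-bit frame + 12 flag bits */
#define PAGE_MASK  (~((UINT32_C(1) << PAGE_SHIFT) - 1)) /* 0xFFFFF000 */
#define FLAG_MASK  (~PAGE_MASK)                         /* 0x00000FFF */

/* Hypothetical permission flags living in the low 12 bits. */
#define PTE_READ   UINT32_C(0x001)
#define PTE_WRITE  UINT32_C(0x002)
#define PTE_EXEC   UINT32_C(0x004)

/* Build a page table entry that can be written to PADDR unchanged. */
static uint32_t make_pte(uint32_t phys_addr, uint32_t flags)
{
    return (phys_addr & PAGE_MASK) | (flags & FLAG_MASK);
}

/* Physical page frame address stored in the entry. */
static uint32_t pte_frame(uint32_t pte)
{
    return pte & PAGE_MASK;
}

/* Flag bits stored in the entry. */
static uint32_t pte_flags(uint32_t pte)
{
    return pte & FLAG_MASK;
}
```

Because the entry already has the register's layout, the miss handler can copy it from the page table to PADDR without any shifting, which is the point wpwrak makes about the PADDR coming "straight from the page table".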
mwalleok agreed, then one exception is enough, isnt it?22:43
mwallebecause the handler doesnt need to distinguish between instruction and data tlb22:44
wpwrakshould be, yes. as far as i can tell, there's nothing really different between them anyway22:44
wpwrakof course, while implementing things, surprises may surface :)22:44
mwalleand BADADDR should be the same CSR number as VADDR22:45
mwallewith the lowest bit set or not?22:45
wpwrakaye. and more than that, it should be the same register. not one being a read and the other a write register that have nothing in common22:45
wpwrakif we use the (VADDR[0] ? I : D)##TLB approach, yes22:46
mwallemasking that bit out, doesnt cost us an instruction, right? because we need to shift anyway?22:47
wpwrakwe never look at the lower 12 bits. that is, unless we run into trouble. but that's no longer in the fast path22:47
wpwraktrouble = segfault / oops22:47
mwalleok so to conclude, we have an TLBVADDR, which is read/write (read for BADADDR), VADDR[0] indicates tlb, writes to PADDR triggers a  TLB update22:50
mwalleTLBCTRL is only needed for invalidation/flushing22:50
mwallereading PADDR should return the TLB entry for a given VADDR imho22:51
mwalleshouldnt be too hard in H/W ;)22:51
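The register behaviour mwalle summarises can be modelled in software to check that it hangs together. This is only a sketch of the proposal as stated in the log, not of any existing hardware. The direct-mapped organisation, the 64-line size, the polarity of VADDR[0] (1 = ITLB) and the toy `identity_lookup` page table are all assumptions.

```c
#include <stdint.h>

#define PAGE_SHIFT  12u
#define TLB_ENTRIES 64u            /* assumption: direct-mapped, 64 lines each */
#define VADDR_ITLB  UINT32_C(1)    /* assumption: VADDR[0] = 1 selects the ITLB */

struct tlb_entry { uint32_t vpn; uint32_t paddr; int valid; };

struct mmu_model {
    uint32_t vaddr;                        /* the single, read/write TLBVADDR */
    struct tlb_entry itlb[TLB_ENTRIES];
    struct tlb_entry dtlb[TLB_ENTRIES];
};

/* Pick the TLB line addressed by the current VADDR, using VADDR[0]. */
static struct tlb_entry *select_line(struct mmu_model *m)
{
    struct tlb_entry *tlb = (m->vaddr & VADDR_ITLB) ? m->itlb : m->dtlb;

    return &tlb[(m->vaddr >> PAGE_SHIFT) % TLB_ENTRIES];
}

/* Hardware side: a miss latches the fault address plus the TLB selector. */
static void tlb_miss(struct mmu_model *m, uint32_t fault_addr, int is_instruction)
{
    m->vaddr = (fault_addr & ~UINT32_C(1)) | (is_instruction ? VADDR_ITLB : 0u);
}

/* Writing PADDR updates the line selected by VADDR ("magic" PADDR write). */
static void write_paddr(struct mmu_model *m, uint32_t value)
{
    struct tlb_entry *line = select_line(m);

    line->vpn = m->vaddr >> PAGE_SHIFT;
    line->paddr = value;
    line->valid = 1;
}

/* Reading PADDR returns the entry for the current VADDR, or 0 if none. */
static uint32_t read_paddr(struct mmu_model *m)
{
    struct tlb_entry *line = select_line(m);

    return (line->valid && line->vpn == (m->vaddr >> PAGE_SHIFT)) ? line->paddr : 0;
}

/* Toy page table for illustration only: identity mapping, no flags. */
static uint32_t identity_lookup(uint32_t vaddr)
{
    return vaddr & ~UINT32_C(0xFFF);
}

/* The whole miss handler: fetch the page table entry, write it to PADDR. */
static void miss_handler(struct mmu_model *m, uint32_t (*lookup)(uint32_t))
{
    write_paddr(m, lookup(m->vaddr));
}
```

In this model the handler never inspects VADDR[0] itself, which matches the conclusion that a single exception vector is enough.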
Fallenou00:50 < wpwrak> when entering the TL miss handler, VADDR is already pre-set, including VADDR[0] < the one you read from, not the one you write to22:53
wpwrakhmm. do we have a read strobe ?22:53
Fallenouif you read from VADDR, you get the address causing the miss22:53
wpwrakFallenou: i would make the two the same22:53
Fallenoubut behind the scene you have two different registers22:53
Fallenouone you read from (faulty address)22:53
wpwrakFallenou: have one VADDR register. if there's a fault, you write the fault address to it22:53
Fallenouone you write to (to set up a mapping or invalidate a line)22:54
Fallenouwell, ok it's possible I think :)22:54
Action: Fallenou 's brain is heating up reading pipeline drawings22:54
wpwrakFallenou: this means that you have to be careful when updating the TLB, but i think you don't need to be more careful than you already have to be22:54
Fallenougn8 !22:55
wpwrakFallenou: e.g., avoid all exceptions and disable interrupts before doing any such thing. or else you may find yourself in trouble :)22:55
Fallenouwpwrak: indeed having the vaddr already set up is good, because you don't have to set it using software when updating the line22:55
Fallenoubut beware not having another miss while handling the previous22:55
Fallenouyou would lose vaddr22:56
Fallenoubut it should not happen22:56
Fallenouirq are disabled, that's ok22:56
Fallenouexceptions are not22:56
Fallenoulm32's exception are not designed to be "turned off"22:56
mwalleFallenou: but mmu is disabled, so you cant get another miss22:56
Fallenouwe just have to try not using misaligned load/store, avoid divide by zero, etc etc :)22:57
Fallenouand we should be ok :p22:57
Fallenougn8 mwalle !22:57
wpwrakyou can also have non-{exception22:57
wpwraklet's try this again22:57
Action: Fallenou is fighting against icache refilling during tlb miss22:57
wpwrakyou can also have non-{exception|interrupt} TLB updates that get interrupted by an exception/interrupt and have VADDR changed by a fault22:58
Fallenouoh, yes22:58
Fallenouthat could really happen22:58
wpwrakwhich wouldn't be much fun either. so you have to 1) disable interrupts and 2) code it such that you can't fault in the middle of the operation22:58
mwallemh not interrupts, because we can say 'you have to turn of interrupts'22:59
mwallebut then still, we could have a fault...22:59
mwallepage miss22:59
Fallenouyes but your code which is playing with tlb can generate a tlb miss22:59
wpwrakjust put it into a bit of asm("...") :)22:59
Fallenouyou have to be sure you don't cross a page boundary22:59
wpwrakload registers first, then work from registers22:59
wpwrakah, ITLB miss. right.23:00
mwallemhmhmh, why does the os need to run with mmus enabled?23:00
wpwrakyes, you need alignment, too :)23:00
Fallenoumwalle: because OS needs virtual addressing23:00
FallenouI mean kernel23:00
Fallenoufor vmalloc etc23:00
mwallejust because linux expected it to be at 0xc0000000 ?!23:00
wpwrakdon't modules vmalloc their code space ?23:00
mwalleFallenou: well it does run without it atm ;)23:01
FallenouI am really not a linux expert but several people told me "linux runs with MMU enabled"23:01
Fallenouyes indeed :p23:01
mwallewpwrak: dunno23:01
FallenouI guess if you don't have vmalloc for kernel you may lose a lot of functionnalities ? (modules ?)23:02
Fallenoudon't know exactly23:02
Fallenoua good question to ask on kernelnewbies :)23:02
wpwrakvmalloc is normally used for data. but in the back of my head, i seem to remember that modules do ugly things.23:02
mwalledo we really need to update the tlb from other places than the miss handler?23:02
wpwrakof course, that may just be to avoid having to make a large contiguous allocation23:02
Fallenouin theory no mwalle but there must be exceptions23:02
wpwrakyes, now it makes sense23:02
Fallenouthat we don't think of23:02
Fallenoumost kernel code that interacts with user space (ioctl, syscalls) are allocating using vmalloc23:03
wpwrakmwalle: invalidation23:03
Fallenoubecause user space does not care if it's physically contigious or not23:03
wpwrakmwalle: writing a valid entry ... maybe not23:03
Fallenouand physically contiguous memory is a rare resource23:03
Fallenouyou need to keep it for hardware/dma23:03
mwallewpwrak: mah,, so simple.. ;) invalidation, right..23:03
wpwrakwe could always just update the page table and let the TLB miss handler do the propagation23:04
Fallenouwell you need to apply changes to the tlb23:04
mwallewpwrak: yeah but as you said invalidation of just one entry wont work23:04
Fallenouelse the mapping will not be invalidated23:04
wpwrakFallenou: errr, vmalloc'ed memory normally doesn't go to user space23:04
Fallenouand between write to VADDR and write to TLBCTRL you can be interrupted by a itlb miss23:04
Fallenouwpwrak: really ?23:05
Fallenouvmalloc'ed+zeroed ?23:05
Action: Fallenou opens his kernel book at page vmalloc()23:05
wpwrakit's for large kernel-internal allocations. not sure if you could actually send vmalloc'ed memory to user space. (i mean, whether there's code that does that. of course, you could hack it.)23:05
mwalleso guys think again ;) there must be a smart solution for this ;)23:06
mwalleim going to bed23:06
wpwrakwe could have a INVALIDATE_VADDR register. that would make the operation atomic ;-)23:06
Fallenouindeed when modules are loaded into memory23:06
Fallenouthey are loaded in memory allocated via vmalloc23:06
mwallewpwrak: please not ;)23:07
mwallesmart solution ;)23:07
Fallenouok let's think a bit more, tomorrow :p23:07
wpwrakmwalle: how about "if you write to TLBCTRL, the VADDR is fetched from r1" ? ;-)23:07
Fallenougn8 mwalle ! it's great to have those brainstormings :)23:07
Fallenouno please no23:08
Fallenouno regular registers23:08
Fallenouit's a pain , really23:08
mwallewpwrak: still too transparent :b23:08
wpwrakhave a two-level FIFO. if you get an exception while it's half-full, adjust the return address such that you return to the previous instruction :)23:09
mwalleto make it easier to fetch data, load the value at r1 ;b23:09
Fallenouwpwrak: ohoh, not bad :)23:10
Fallenoua bit hackish23:10
wpwrakdoes the WCSR instruction have any unused bits ? maybe we could hide the TLBCTRL command there (-:C23:10
Fallenoulet the poor mwalle go to sleep23:10
FallenouI should go as well23:10
wpwrakhe should have plenty of material for some impressive nightmares ;-)23:11
Fallenouwpwrak: only possible unuseds bits are "csr id" but I don't think we can divide their numbers by two23:11
Fallenouor we limit value written to csr to 15 bits :p23:11
mwalleFallenou: the opcode itself has many unused bits23:12
mwallecsr id is only 5 bits wide iirc23:12
wpwrakwell, if all else fails, we can still do just what i've already described above: disable interrupts, make sure you don't get a DTLB miss, and align the code such that it doesn't cross a page boundary.23:12
wpwrakplan B: make a trampoline that executes with the TLB off23:12
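The "doesn't cross a page boundary" condition in the fallback plan is easy to state as a check. The sketch below assumes 4 KiB pages, as elsewhere in the discussion. `crosses_page` is a hypothetical helper that a build-time or boot-time assertion could call on the address range of the TLB update sequence.

```c
#include <stdint.h>

#define MMU_PAGE_SIZE UINT32_C(4096)   /* assumption: 4 KiB pages */

/* Returns 1 if the byte range [start, start + len) touches more than
 * one page, 0 otherwise. An empty range never crosses a boundary.
 * Ranges that wrap around the top of the address space are not handled. */
static int crosses_page(uint32_t start, uint32_t len)
{
    uint32_t last;

    if (len == 0)
        return 0;

    last = start + len - 1;            /* address of the final byte */
    return (start / MMU_PAGE_SIZE) != (last / MMU_PAGE_SIZE);
}
```

If the sequence fits in one page and that page is already mapped in the ITLB when the sequence starts, no further ITLB miss can occur while it runs. With GCC, one way to arrange this might be to give the function a suitable alignment with `__attribute__((aligned(...)))`, though the log does not settle on a mechanism.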
Fallenouoh right23:13
Fallenouindeed wcsr only write from register23:13
Fallenouso you have the 16 immediate bits for free23:13
wpwrak16 bits to play with. bwahaha ! :)23:14
Fallenoumaybe we could use them ^^23:14
Fallenouthat would make "two wcsr instructions"23:14
wpwrak48 bit CSR registers :)23:14
Fallenouwe could put tlbctrl commands in those 16 bits23:15
Fallenouand then suppress tlbctrl23:15
Fallenouonly write to vaddr/paddr with the command hidden in the 16 lower bits23:15
Fallenoutlbwcsr tlbvaddr, my_value << 16 | my_command23:16
Fallenouwell no23:16
Fallenoubut you got the idea23:16
Fallenouthat would be a mess :( plenty of wcsr*** commands23:16
wpwrakasm("") is your friend :)23:17
Fallenouor that would mean adding a tlbwcsr statement in gnu-as, which takes two args23:17
Fallenou1 register and 1 immediate23:18
wpwrakwe could also keep the TLBCTRL code in the lower bits of VADDR23:18
Fallenouoh, right23:18
wpwrakwrite to PADDR -> update entry. write VADDR -> perform operation. (and make one a no-op)23:18
Fallenouyes, good idea23:19
wpwrak(tlbwcsr) three args. you may have more than one destination CSR :)23:19
Fallenou01:26 < wpwrak> write to PADDR -> update entry. write VADDR -> perform operation. (and make one a no-op) <= I guess you solved the problem ?23:19
Fallenouthis seems enough to take care of everything23:20
wpwrakalready ? that would be disappointingly easy :)23:21
Fallenouwell that makes invalidating a line a single instruction23:21
wpwrakbut yes, in case we find we need anything more, we can always introduce extra CSRs23:22
Fallenouonly writting to VADDR23:22
wpwraki think atomic operations are the best approach. no need to worry about a lot of things.23:22
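The idea of keeping the TLBCTRL command in the low bits of VADDR, so that a single CSR write both names the page and triggers the operation, could look like the sketch below. The field positions and command values are invented for illustration. The log only fixes that VADDR[0] selects the TLB, that the low 12 bits are otherwise unused, and that one command should be a no-op.

```c
#include <stdint.h>

#define PAGE_MASK       UINT32_C(0xFFFFF000)  /* upper 20 bits: virtual page */
#define VADDR_ITLB      UINT32_C(1)           /* bit 0: 1 = ITLB, 0 = DTLB (assumed polarity) */
#define VADDR_CMD_SHIFT 1u                    /* assumption: command in bits 3:1 */
#define VADDR_CMD_MASK  UINT32_C(0x7)

/* Hypothetical command encodings. NOP is 0 so that a fault address
 * latched by hardware (with no command bits set) triggers nothing. */
enum tlb_cmd {
    TLB_CMD_NOP        = 0,
    TLB_CMD_INVALIDATE = 1,   /* drop the line matching this page */
    TLB_CMD_FLUSH      = 2    /* drop every line of the selected TLB */
};

/* Build the word for a single write to VADDR. */
static uint32_t tlb_vaddr_word(uint32_t vaddr, enum tlb_cmd cmd, int itlb)
{
    return (vaddr & PAGE_MASK)
         | (((uint32_t)cmd & VADDR_CMD_MASK) << VADDR_CMD_SHIFT)
         | (itlb ? VADDR_ITLB : 0u);
}

/* Decoders, as the hardware side would see the written word. */
static enum tlb_cmd vaddr_cmd(uint32_t word)
{
    return (enum tlb_cmd)((word >> VADDR_CMD_SHIFT) & VADDR_CMD_MASK);
}

static uint32_t vaddr_page(uint32_t word)
{
    return word & PAGE_MASK;
}

static int vaddr_is_itlb(uint32_t word)
{
    return (word & VADDR_ITLB) != 0;
}
```

Since the page, the TLB selector and the command travel in one register write, nothing can interrupt the operation between "set VADDR" and "issue command", which is the atomicity argument made above.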
Fallenoutlb flush would be performed using vaddr ? or still tlbctrl ?23:22
Fallenouwould be kind of ugly to do this through vaddr I think23:23
Fallenousince it involves ALL the tlb, and not a vaddr in particular23:23
wpwrakonly perhaps what happens if you're executing a TLB change and an ITLB miss occurs when fetching one of the following instructions. may be a bit delicate not to mess this up :)23:23
Fallenouand tlbctrl seems useless now that you can feed commands in vaddr lower bits :(23:23
wpwrakyeah, we killed TLBCTRL :)23:24
FallenouI think too23:24
wpwrakVADDR could take care of all the operations. we have a lot of bits to play with :)23:24
Fallenouenough killing for tonight23:25
Fallenousee you tomorrow :)23:25
Fallenougn8 !23:25
wpwraksweet dreams ! :)23:26
Fallenouthanks, you too !23:27
wpwrakhmm, in a few hours :)23:28
--- Tue Jul 24 2012 00:00
