#milkymist IRC log for Wednesday, 2011-04-27

kristianpaulFallenou: so in rtems i just need to modify a pointer value to point anywhere in memory, isn't it? or are there limitations on what i can point to, considering that the FS is also in ram..01:46
kristianpaulit may sound stupid, but i need to confirm01:46
Fallenoukristianpaul: look at how registers are read from and written to06:07
Fallenouyou can use the address directly06:07
Fallenoubeware of cache problems, volatile etc06:07
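A minimal C sketch of what Fallenou describes (the macro name and the CSR address are made up for illustration; real Milkymist register addresses live in the board's headers). The volatile qualifier forces an actual bus access on every read and write instead of letting the compiler cache or reorder them; on a cached bus you may additionally need an uncached mapping or explicit cache flushes.

```c
#include <stdint.h>

/* Dereference a raw address as a hardware register. volatile forces a
 * real bus access on every read/write. */
#define MMPTR(addr) (*((volatile uint32_t *)(addr)))

/* Hypothetical CSR address, for illustration only. */
#define CSR_EXAMPLE 0xe0001000u

static inline uint32_t reg_read(uintptr_t addr)
{
    return MMPTR(addr);
}

static inline void reg_write(uintptr_t addr, uint32_t value)
{
    MMPTR(addr) = value;
}
```

On RTEMS with a flat address space this is all that is needed; the FS living in RAM does not restrict which addresses a pointer may hold, though writing through a wild pointer can of course corrupt it.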
terpstralekernel, were you imagining a separate TLB for instruction and data buses? (seems best given the dual-ported design of the LM32)08:12
lekernelaw: so it seems the new protection system works great08:52
awyes, SCH D.08:53
awi am using the official adapter to record data again though...08:53
awthe holding current seems actually higher than any I measured before. yup, of course it must be, due to the 2A fuse.08:54
awmeanwhile I am watching the temperature, especially with our current adapter, to see the surrounding temperature around the DC jack.08:55
awthis is mostly what I am checking now though. :-)08:56
awyou can also see the value marked '2.85A' is my limit from the lab power supply...and its output/capability won't drop too much under load. I think that I need to do a 'burn-in run' for at least 1 week or do ageing on the adapter.08:59
awlekernel, i really forgot that 1A must be available for the two host usb ports, thanks for your last email reminding me.09:02
lekernelaw: why do you have 40mA going through the diodes at 5V?09:12
awlekernel, where?09:13
lekerneltable 1 non-reversed 5 4.994 4.992 0.04 / -09:13
awumm..it's a no-load condition (without MM1)09:14
lekernelthere's another zener in the same series with a 5.6V voltage, maybe it's better to take that one09:14
lekernelwith yours the minimum specified zener current is 4.85V, that might explain that 40mA current09:15
lekernels/zener current/zener voltage09:15
awwhen initially powered up with NO LOAD on the circuit, the fuse is cold, as before it goes into the 'holding' stage.09:16
lekernelyou didn't get my point. the thing is that with a 5V voltage, your circuit should consume ZERO power. but instead of that you have 40mA through the diodes.09:16
awa fuse can still have current flowing even when it stays over the 'holding' value; then the current slowly rises to its 'cut/trip' current.09:17
lekerneldo we really want to have this protection circuit continually consume power and get hot?09:18
awhmm..i know it seems strange to you, I'll measure again later. :-) 5V is less than 5.1V, so you think there should be no current. :-)09:18
lekernelwell your measurement is probably correct09:18
lekernelthe diode datasheet specifies that the zener voltage can be as low as 4.85V09:19
awi actually haven't decided yet if we'll use this circuit.09:19
lekerneland we do not want that, so I'm suggesting that we take the 5.6V diode instead with a minimum voltage of 5.32V09:19
awwhen I saw/discovered those temperatures.09:19
awbtw, even if we pick a 5.6V diode, I can imagine the temperature will still be there though. this is worse than rc2.09:20
lekernelit will still get hot **when the user exceeds the specified voltage**09:21
awwell the truth is this h/w batch is better than rc2 in having a protection function.09:21
awbut yes.09:21
lekernelnot when they use the recommended adapter09:21
lekernelwith your zener, it would get hot with the recommended adapter09:21
lekernelgetting hot when the user does something stupid isn't a problem09:22
awso i really haven't decided this though. I even think that I personally don't like this batch now.09:22
lekernelI do. 2A fuse, 5.6V zener, done.09:22
awso I am trying to find out how warm our adapter will get.09:23
lekernelwith the 5.6V zener there should be ZERO current and ZERO heating09:23
awwell...good idea on 5.6V though.09:24
awusb spec requires 4.75~5.25V too. so a 5.6V diode is over that. that's why i picked 5.1V.09:25
lekernelyeah I know09:25
awbut I can try though to see how low it will be. :-)09:25
lekernelbut 5.6V for a short period shouldn't do much damage, and is definitely better than having 20V or so if the user is stupid09:26
lekerneland there is still good protection for reversed polarity or AC adapters09:26
awbut in real conditions we don't want users to use a 20V adapter.09:26
lekernelI know. but the whole point of this protection is to provide some security against human stupidity09:27
lekerneland I insist on "some", as stupidity is infinite there can be no fully adequate protection09:27
awso like we declare that board "suggested input range: 4.75 V ~ ?V"..09:28
lekernelno, we declare it as a *mandatory* input range09:28
lekernelno change compared to rc209:28
lekernelbut there is an additional safety belt if users do not listen to that09:28
lekernelso it ends up doing less damage09:28
awwell..wait, I supposedly did not provide a real condition. :-)09:29
lekernel5.6V is 1N5339BG09:30
awyup...that's why i said that I have no idea / haven't decided if we need these h/w patches...too many unknown conditions could happen.09:30
lekerneljust take that, do some quick testing and go ahead09:30
awha...you really want to try that 5.6V even though it's over 5.25V for usb?09:31
lekernelthe wanted result is that the board should a) have no regression b) incur less or no damage when fed inappropriate voltages09:31
awsurely i can quickly go for it.09:31
lekernelyes, definitely09:31
lekernelas I said09:31
lekernel5.6V on USB wouldn't damage much in most cases, and is still a lot better than whatever overvoltage an inappropriate adapter would give09:32
lekernelimagine this situation: user plugs a 20V adapter to the M109:32
awokay..imaginable though...09:32
lekernelwith USB devices on it09:32
lekernelwithout the protection you get 20V on port, and this will probably break the USB devices09:33
awsorry that i am going to outside now...09:33
awtalk to you later.09:33
lekernelwith the protection you get 5.6V or so for a dozen seconds, and this will probably NOT break the USB devices09:33
awlater back to see this. cu09:33
lekernelso. 2A fuse, 5.6V zener.09:33
lekernelperiod :-p09:34
awtime to go..cu09:34
terpstralekernel, a thought: couldn't a LM32 TLB just work like CSRs work right now?09:45
terpstrain 'kernel mode' there is no address translation09:45
terpstrain 'user mode' you use a CAM lookup of these TLB registers for the appropriate page09:46
terpstraand if there is a miss, a segfault exception is raised09:46
terpstraand the OS has to fill in the missing page into some CSRs09:46
terpstrawe already have to save/restore 32 registers on context switch, so saving some 16-32 extra TLB entries doesn't seem like much more overhead09:46
terpstrai guess the CSR namespace has been filled up too much with other CSRs, but a single new instruction 'WTLB' that behaves almost like the 'WCSR' should be enough to get the job done09:48
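A toy C model of the scheme terpstra sketches, with assumed sizes (4 kB pages, 64-entry direct-mapped TLB; nothing here reflects actual lm32 hardware): kernel mode bypasses translation, a user-mode lookup that misses would raise a segfault/refill exception, and a WTLB-style write lets the OS fill in the missing entry.

```c
#include <stdint.h>
#include <stdbool.h>

#define PAGE_BITS   12              /* assumed 4 kB pages */
#define TLB_ENTRIES 64              /* assumed direct-mapped TLB size */

struct tlb_entry { uint32_t tag; uint32_t pfn; bool valid; };

static struct tlb_entry tlb[TLB_ENTRIES];
static bool kernel_mode = true;

/* Model of a WTLB-style write: the OS fills one entry after a miss. */
static void wtlb(uint32_t vpn, uint32_t pfn)
{
    struct tlb_entry *e = &tlb[vpn % TLB_ENTRIES];
    e->tag   = vpn / TLB_ENTRIES;
    e->pfn   = pfn;
    e->valid = true;
}

/* Translate a virtual address; returns false on a TLB miss, which in
 * hardware would raise the exception terpstra describes. */
static bool translate(uint32_t vaddr, uint32_t *paddr)
{
    if (kernel_mode) {              /* kernel mode: no translation */
        *paddr = vaddr;
        return true;
    }
    uint32_t vpn = vaddr >> PAGE_BITS;
    struct tlb_entry *e = &tlb[vpn % TLB_ENTRIES];
    if (!e->valid || e->tag != vpn / TLB_ENTRIES)
        return false;               /* miss -> exception */
    *paddr = (e->pfn << PAGE_BITS) | (vaddr & ((1u << PAGE_BITS) - 1));
    return true;
}
```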
wpwrakterpstra: the kernel also needs to be able to copy to/from user space. better if it can use the TLB for this, instead of having to figure out these things "manually". could be a one-shot switch, though. e.g., set a bit that makes the next access use the TLB, then switch back.10:04
wpwrakterpstra: another thing for the kernel: for vmalloc, you also want the MMU in kernel space10:04
terpstrawpwrak, why does it need the tlb to copy to user space? it knows which page is at which address for the user-space, so it can just copy to the appropriate page's physical address10:05
wpwrakterpstra: yes, it's possible but messy10:06
terpstraif i recall correctly, the linux kernel already has a function you are supposed to call when accessing user-space memory via a pointer as provided from user-space10:06
terpstraie: if you get a pointer from user-land via an ioctl, you are supposed to convert it for use inside kernel space10:07
wpwrakterpstra: yup. you have these functions. as i said, you can do all this without mmu support, but it's a lot of overhead10:07
terpstranot so much overhead as compared to reloading the TLB i'd wager... ?10:08
wpwrakterpstra: for example, if you copy a string byte by byte, you need to do a page table lookup and permission check for each access. messy.10:08
terpstrawhat? why would you do that?10:08
terpstrado one lookup for the whole block transfer10:08
wpwrakterpstra: for larger accesses, you also have to check if you're crossing a page boundary10:08
terpstracrossing page boundary, sure10:09
terpstrabut doing a single table lookup per page copied sounds like negligible overhead to me10:09
wpwrakterpstra: yes, if this is implemented as a block transfer. this isn't always the case.10:09
wpwrak(reloading the tlbs) why not have two ? one for user space and one for kernel space10:10
terpstraarea cost10:10
wpwrakis the cost prohibitively high ?10:10
terpstrawell the TLB will need to be a fairly highly associative cache10:10
terpstraand we'll need one for each bus already10:10
terpstramaking kernel-mode need it too doubles the cost10:11
wpwrak(each bus) you mean instruction and data ?10:11
wpwrakyou probably don't need an I-TLB for the kernel. so the extra cost is only +50% ;)10:12
terpstrafor an FPGA we probably can't make it a fully associative cache like in a real CPU, as we don't want to use tons of registers; we will need a 2- or 4-way set-associative TLB in order to use FPGA ram blocks10:13
terpstraTLB is going to be really expensive in area i think10:13
terpstragoing to be slow too. :-/10:14
wpwrakwell, you could make a really simple TLB (e.g., one entry) and collect statistics :)10:14
terpstrayou need in sequence: RAM block indexing (based on low page id bits), then comparison of TLB tag to high page bits, a MUX to pick the correct entry in the associative cache, then comparison of TLB result to L1 cache tag for the physical tagging check, finally the signal has to trigger an exception10:16
terpstrathat's some deep signalling...10:16
terpstraall this happens between two clock edges10:17
wpwrakyeah. well, you have to do this anyway, whether you have a kernel tlb or not.10:18
terpstrabut kernel TLB just makes it even bigger ;)10:18
wpwrakah, and you don't need the kernel tlb for kernel/user space access. you'd just reuse the user space tlb. what you need is a way to switch it on while in kernel mode.10:19
terpstramaybe just one TLB10:19
terpstraand have kernel mode bit enable access to a 'restricted' memory range10:20
terpstrathen you can happily re-use user-space pointers when copying to/from your kernel-land memory in the restricted range10:20
wpwraknot sure how badly you need vmalloc in the kernel. it's kinda frowned upon, though not enough that people wouldn't use it ...10:20
terpstrathe restricted range doesn't go through TLB10:20
terpstrathink 1GB is enough memory for userland? ;)10:21
wpwrakthat would be more or less equivalent to a 2GB/2GB split. yes, a possibility10:21
terpstraor maybe: 2GB user-land, 1GB kernel land, 1GB memory mapped IO non-cached region10:22
terpstrauser mode cannot access addresses with high bit set10:22
wpwrakyou're very generous with that address space :)10:22
terpstraaddresses with high bit set do not go through TLB10:22
wpwrakwell, for a first version that'll do. can always be improved later.10:22
terpstraunfortunately, my idea of a WTLB instruction won't work10:25
terpstrasince a TLB entry will need to be 40 bits wide10:25
terpstrawell, i guess it could be made to work if we have 256 TLB entries. *cackle*10:26
terpstra<1 bit user/kernel> <19 bits virtual page number> <12 bits page offset>10:27
wpwrakwhy 40 bits ?10:27
terpstrathe 19 bits virtual page number = <13 bits TLB tag> <6 bits TLB index>10:28
terpstrathen your TLB entries have: <13 bits TLB tag> <19 bits physical address>10:29
terpstraand it fits!10:29
terpstraand only 32 TLB entries needed10:29
terpstra(i was imagining a full 20-bits for virtual address and physical address)10:29
terpstrathis way you can pack it better, though10:30
wpwrakah, regarding the split. it's not so nice, because you'd then have to check that user pointers are in the correct address range, along with overflow issues. probably still better to have a means to just switch the user mode for the next access.10:30
wpwrakyou also need permission bits: read, write, and execute would be desirable, too10:32
terpstrawe have two TLBs one for data and one for instruction10:32
terpstraso execute means it is in the instruction TLB10:32
terpstrai suppose read/write needs a bit, though for the data bus10:32
wpwrakvery good. so just one for write.10:32
terpstradamn you10:33
wpwrakhehe :)10:33
terpstrathere be not enough bits ;)10:33
terpstrashould it be possible for a user to map device memory ?10:34
terpstrai suppose this is useful especially for a micro kernel10:34
wpwrakhmm yes. that would be very nice to have.10:34
terpstraso you need a full 20 bit physical address in the TLB10:35
wpwrakalso for plain user space. think the old architecture of the X server.10:35
wpwrakor all my current atrocities surrounding UBB on the ben ;-)10:35
terpstraso 20 bits for physical address, 1 bit for read/write flag.....10:35
terpstrathat means only 11 bits for the tag10:35
terpstrai guess if you had 8 bits of TLB index (256 entries... eek)10:36
terpstrathat's too big10:37
terpstraor give up on fitting the TLB entry in 32 bits10:37
terpstraor go for a bigger page size ;)10:39
wpwrakkeep things easy - use 1 GB pages :)10:40
terpstra8k page size would mean <19 bits physical address> and thus <12 bits virtual address tag> and only <6 bits for the TLB index>10:40
terpstraso back to 32 TLB entries10:41
terpstrathat is nice10:41
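The bit budgets quoted above don't quite close (a 12-bit tag plus a 6-bit index covers only 18 of the 19 page-number bits an 8 kB page leaves). One split that does close, shown here as a hedged sketch rather than a settled design, is a 12-bit tag, a 7-bit index (128 entries) and a writable bit, which packs with the 19-bit physical page number into exactly 32 bits:

```c
#include <stdint.h>

/* Assumed 32-bit entry layout for 8 kB pages (13-bit offset, 19-bit
 * page numbers), direct-mapped with a 7-bit index:
 *   [31:20] virtual tag (12 bits)
 *   [19:1]  physical page number (19 bits)
 *   [0]     writable flag
 * Total: 12 + 19 + 1 = 32 bits. */
#define OFFSET_BITS 13
#define INDEX_BITS  7
#define TAG_BITS    12
#define PPN_BITS    19

static inline uint32_t tlb_pack(uint32_t vtag, uint32_t ppn, int writable)
{
    return (vtag << 20) | (ppn << 1) | (uint32_t)(writable & 1);
}

static inline uint32_t tlb_tag(uint32_t entry)      { return entry >> 20; }
static inline uint32_t tlb_ppn(uint32_t entry)      { return (entry >> 1) & 0x7ffffu; }
static inline int      tlb_writable(uint32_t entry) { return (int)(entry & 1); }
```

With this layout a WTLB instruction writing one 32-bit register per entry would indeed be enough, at the cost of 128 rather than 32 entries.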
wpwrakplus, that way you'll find all the programs that assume that a page is 4 kB :)10:41
terpstrathey've been fixed already i think10:42
terpstradebian must run on stuff with 8k pages by now10:42
wpwrakrun or stumble :) well, you can try 8 k and if it sucks too much, go to 4 k10:43
lekernelcan't we just disable address translation in kernel mode?11:04
lekernelthis way we're also backward compatible with programs like RTEMS stuff that do not use the MMU11:04
lekernelthey just run in kernel mode all the time11:04
terpstralekernel, that's what i wanted to do too11:08
terpstrabut wpwrak says its a problem11:08
terpstraso what do you think about just grabbing the entire TLB on context switch like we have to handle registers anyway?11:09
terpstrait doesn't/shouldn't be as big as the L1 caches anyway11:10
lekerneldepends... how big is the TLB?11:10
lekerneland how do we ensure compatibility with programs that do not use the MMU?11:10
terpstrawell, i also liked the idea that kernel mode = no MMU... then you have your compatibility11:11
lekernelI don't think there's a problem, Norman pointed out on the list that Microblaze does that11:11
terpstrai've been reading around, and it seems that the TLB for mips isn't so big11:11
terpstraeven the AMD64 only has 1024 entries11:11
terpstraso 32 should be fine i guess11:11
terpstraprobably 16 is already plenty11:11
terpstraR2000 had 64 entries11:12
terpstraR4000 had 32 to 6411:13
terpstra(so later versions had fewer entries, which seems suggestive to me)11:13
lekernel"TLB is organized as 3-way set associative."11:14
terpstrayeah, we definitely will need associativity11:14
lekernelif we have only 32 entries, it can be fully associative, no?11:14
terpstrai suppose we could try without at first tho11:14
terpstraproblem with fully associative is it rules out using RAM cells11:15
terpstrayou need full registers then11:15
terpstrawhich is a lot11:15
terpstraon my cyclone3 the LM32 needed only like 1k registers for the full design i think11:16
lekernelwe can also have no associativity and a lot of TLB entries to compensate11:16
lekernelso we take advantage of the BRAM11:16
terpstrai think for a first version this makes the most sense11:16
lekernelbut reloading the TLB would take time during context switches then...11:16
terpstrahowever, i don't totally buy into 2- and 4-way associative being like a 2x or 4x bigger cache11:17
lekernelthough probably not a lot more than those architectures which flush the L1 caches on each context switch11:17
terpstrathere are many byzantine scenarios that can happen in practice where associativity is >>> more slots11:17
lekernelyeah sure11:17
lekernelas a general rule x-way associative has better performance than x times the size11:18
terpstrabut for a first version, i think non-associative makes sense11:18
lekernelnon portable though11:20
terpstrathat's nice for you xilinx users11:20
lekernelyeah... and xilinx patented the srl16 too11:21
terpstraso basically one LUT can decode 4-bit index ?11:21
terpstrathat's possible on altera too11:21
terpstraproblem is that you can't reprogram the LUT at run time ;)11:21
terpstrai guess this is the value added part of the xilinx approach?11:22
terpstraahh, yes, i see it now11:22
terpstraSRL16E diagram11:22
terpstrato mimic a SRL16E portably i would need 4 registers, and 3 LUTs i think11:23
terpstrawpwrak, do you realllllly need the mmu in kernel mode?11:25
wpwrakterpstra: maybe the best approach is to implement a trivially simple TLB, run a test load (e.g., kernel compilation, emacs, whatever) and keep statistics of what happens. then pick a design accordingly.11:25
terpstrawe also need a way to determine the address that triggered a TLB miss11:26
wpwrakterpstra: (mmu in kernel mode) well, for vmalloc ...11:26
terpstrawpwrak, why does vmalloc need an mmu?11:26
terpstracan't it just allocate from the physical address space?11:26
lekernelterpstra: well I think that having a large non-associative TLB in a block RAM is good for starters11:27
wpwrakterpstra: because it can give you virtually contiguous allocations even if your pages are all physically fragmented11:27
terpstraCode that uses vmalloc is likely to get a chilly reception if submitted for inclusion in the kernel. If possible, you should work directly with individual pages rather than trying to smooth things over with vmalloc.11:28
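A toy model of the property wpwrak describes (plain C, not kernel code; all names are made up): vmalloc's job is to stitch physically scattered pages into one contiguous virtual range, which is only possible because the MMU translates per page.

```c
#include <stddef.h>

/* Toy demonstration of virtual contiguity over fragmented physical
 * memory. A real kernel walks real page tables; this just models the
 * mapping. */
#define PAGE_SIZE 4096
#define NPAGES    4

static size_t page_table[NPAGES];   /* virtual page -> physical page */

/* Map NPAGES scattered physical pages at consecutive virtual pages
 * starting at virtual address 0. */
static void toy_vmalloc(const size_t *free_phys_pages)
{
    for (size_t v = 0; v < NPAGES; v++)
        page_table[v] = free_phys_pages[v];
}

static size_t virt_to_phys(size_t vaddr)
{
    return page_table[vaddr / PAGE_SIZE] * PAGE_SIZE + vaddr % PAGE_SIZE;
}
```

Without an MMU the caller would have to find NPAGES physically adjacent free pages instead, which is exactly what fragmentation makes hard.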
lekernels6 FPGAs have RAM blocks of up to 16 kilobits each... a few or even just one of them can hold a sizable amount of TLB entries11:28
terpstrai don't think we need/want more than 32 TLB entries11:28
terpstraby keeping the TLB small we can more easily just load/store it from the kernel instead of trying to preserve it like the L1 cache11:28
wpwrakterpstra: (chilly reception) for sure. yet it exists, so .. :)11:29
lekernelterpstra: you mean for encoding the WTLB instruction?11:29
wpwrakterpstra: anyway, you can make the kernel tlb fairly inefficient.11:29
lekernelI don't see what the problem is with a large TLB, except more context switch overhead11:29
terpstrai don't want context switch overhead11:29
terpstraeither we need to leave stale TLB entries that get flushed on demand (more work for the hardware)11:30
wpwrakterpstra: ah, and i think modules may use the mmu too. so, i-tlb for the kernel as well. life sucks, doesn't it ? :)11:30
terpstraor we need to save/restore more TLB entries on context switch11:30
terpstramodules get loaded at different addresses11:31
terpstrai don't think there's MMU action there11:31
terpstrathat's why it's a pain to find the symbol of a module from a kernel register dump11:31
lekernelotoh a larger TLB means less TLB misses11:31
lekernelI don't think it'd be hard to make the TLB size configurable with this approach11:31
lekernelso we can just try and see :-)11:31
terpstrait impacts the layout of the TLB tho11:31
terpstraif you want to pack the TLB entries into 32 bits ;)11:32
terpstrain a perfect world you could have 32 TLB entries, each 32 bit wide11:32
terpstrathen it would have a 'normal' LM32 register encoding11:32
terpstraie: a simple WTBL instruction would work just like WCSR does now11:32
juliusbjust give up on this LM32 stuff, use OpenRISC ;)11:34
juliusbWe've already got this MMU stuff going11:34
juliusbour kernel port is solid, too11:34
wpwrakterpstra: (i-tlb) you're right. doesn't actually run code from the vmalloc'ed region11:34
juliusbone interesting experiment I want to do very soon is actually calculate overhead for TLB misses and reloading11:34
juliusband the effect TLB sizing and associativity has on that11:34
terpstrajuliusb, how does the openrisc do tlb ?11:34
juliusbgood question. the architecture is fairly flexible - allows various sizes and up to 4-way associativity11:35
juliusbi'm not across the details of it specifically off the top of my head11:35
terpstraphysically tagged and indexed?11:35
lekernelyeah, let's use openrisc. then the flickernoise framerate would drop to something like 0.2 fps while the FPGA LUT count increases :-)11:37
juliusbno, I think virtually tagged11:37
juliusbhangon no11:37
juliusblekernel: prove it :)11:37
juliusbno I agree, or1200 aint so tiny11:38
terpstrajuliusb, to be honest i haven't fairly evaluated the openrisc11:38
terpstrait is just so big11:38
juliusbbut, i'm serious about using it if you're considering doing a Linux port11:38
terpstrabut adding an mmu to the lm32 will make it big too11:38
juliusbit's been like 2 years of work for us to just get the kernel port and toolchain to a point where it's usable now11:38
lekernelterpstra: I don't think that a simple TLB in a block RAM would make it very big11:39
juliusbwe have some good kernel developers now, and the HW seems quite stable across various technologies11:39
lekernelmy guess is something like 2 BRAM + 200 LUTs, not more11:39
terpstralekernel, the OR is only 6* bigger than the lm32 :)11:39
juliusblekernel: but as described before, you need a lot more than just a block ram, you need a tag ram and then all the appropriate error detection and exception handling logic11:39
juliusbfor each port11:39
juliusb... it would be an interesting experiment though11:40
terpstrayes, juliusb is right that it will cost us11:40
lekernelsure, that's what those 200 LUTs are for11:40
juliusb.. hey by the way, why do you want to run Linux in the first place??11:40
terpstracause i want debian!11:40
juliusbit's not a good idea for embedded stuff I argue - you have this MMU mess, and it only gets worse if you want shared library code11:40
juliusbyou need all that indirect function calling garbage11:41
terpstra(for gsi/cern we don't want linux tbh)11:41
juliusbit helps extensibility at the software level, but that's it right?11:41
terpstrai am just interested from a hypothetical point of view11:41
juliusbi think you sacrifice a lot of performance just to have the basic benefits of a GNU/Linux, namely the plethora of software out there11:41
terpstrai agree with you11:42
lekernelsame here. i'm globally satisfied with RTEMS.11:42
wpwraki think 2-way could be useful to avoid thrashing block copies. a dirty approach would be to have only one entry 2-way. basically if you evict a tlb entry, you move it to the 2nd way.11:42
wpwrak(that's for data)11:42
juliusbsoftware based on RTOS, however, is far more complicated to write and maintain than stuff that's POSIX compliant for Linux11:42
terpstrawpwrak, that's what a victim cache has been for traditionally ;)11:42
lekernelnot that much11:43
wpwraknot sure what code would be most happy with11:43
juliusb...i mean more complicated to write and then port to a new design or architecture etc.11:43
lekernelas a matter of fact, a lot of 3rd party POSIX stuff runs almost flawlessly on RTEMS11:43
lekernelI have freetype, libpng, libjpeg, libgd, mupdf, ...11:43
juliusbya, I saw RTEMS is POSIX friendly11:43
terpstrathe main advantage of an mmu: fork()11:43
juliusbthat is very good11:43
terpstrai think most of the rest can be dealt with11:43
wpwrakterpstra: aah, already invented. darn.11:44
terpstrawpwrak, i didn't mean to invent it---i meant that's the functionality you gain from an mmu11:44
terpstrayou can't really do fork() without an mmu11:44
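terpstra's point in one small POSIX example: fork() gives each process a private copy of the address space, which an MMU provides cheaply via copy-on-write mappings. MMU-less uClinux instead offers only vfork(), which shares the parent's memory and suspends the parent until the child execs.

```c
#include <sys/wait.h>
#include <unistd.h>

/* Returns the parent's view of x after the child has modified its own
 * copy. With fork() and an MMU, the child's write does not leak into
 * the parent. Returns -1 if fork() fails. */
static int fork_demo(void)
{
    int x = 1;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {        /* child: writes its private copy of x */
        x = 42;
        _exit(0);
    }
    waitpid(pid, NULL, 0); /* parent: wait, then read its own copy */
    return x;
}
```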
juliusbbut who is going to do the port of the kernel to LM32??11:45
juliusbor does it exist already?11:45
terpstrathere is a uclinux port afaik ?11:45
juliusboh good, 2.4 kernels are fun11:45
juliusbthere's no such thing as far as I'm aware, it got merged with the mainline a long time ago, no?11:46
terpstrai've not used it11:46
terpstrai just know lattice claims this11:46
wpwrakterpstra: (invented) i meant the victim cache11:46
terpstrawpwrak, ack11:46
lekernelterpstra: there is a super crappy uclinux port by lattice, which larsc, mwalle, Takeshi and I have improved11:47
lekernelit's still not merged upstream though11:47
terpstrait's 2.6 or 2.4?11:47
juliusbi've just looked, they've got a 2.6 version now11:47
lekernel2.6... in fact we follow upstream11:47
juliusbbut there's MMU-less kernel now, right? and uClibc11:47
terpstraso if an mmu were added, not so hard to get 'proper' linux on it i guess?11:48
juliusbwhat's the difference, then, between uClibc and real kernel?11:48
juliusberr, uClinux and real kernel11:48
juliusbthey strip a lot of crap out of it?11:48
lekernelI don't know. I have little knowledge about linux memory management internals11:48
terpstrauclibc has nothing to do with mmu or not11:48
terpstrauclibc is just a smaller version of libc11:48
terpstrauclibc is under 200k compared to > 3MB for glibc11:49
terpstrayou usually see uclibc + busybox on embedded devices like routers/etc11:49
terpstrawhere you have 8-32MB of RAM11:49
terpstrathose systems also have an MMU11:49
juliusbi'm sure there's some NO_MMU stuff in uClibc11:49
terpstrasure, to remove fork() ;)11:49
Action: wpwrak crawls to bed and hopes for happy dreams of an mmu :)11:49
terpstrayou won't be getting fork() without an MMU11:50
terpstraand that's why even embedded devices with linux have one11:50
terpstrathose cheapo little routers, kindles, android phones, etc --- they all have an MMU even when they have almost no memory11:50
terpstra(tho the kindle actually has half a GB of ram)11:51
juliusbsure, it's an ASIC and probably the extra silicon required to put in an MMU and the reduced amount of software executed to do virtual memory management is worth it11:52
juliusbIf you're really, really, stretched for area, maybe MMU-less makes sense11:52
terpstrawe should really see how much area a completely primitive mmu takes11:52
terpstraif lekernel is right that it's 200 LUTs or less, then might as well have it on an FPGA too11:52
lekernelin milkymist we're only using 44% of the fpga area, so a mmu would get merged provided it does not slow things down or introduce other regressions11:53
juliusbI think you'll want all the performance you can get on FPGA running Linux and it would make a lot of sense to have one11:53
juliusbWe're so concerned about performance on Or1K linux that we're looking at doing hardware page table lookups instead of handling misses11:53
juliusb... in software11:53
juliusbit's really, really, slow11:54
terpstralm32 is very fast11:54
terpstrai bet i could write a TLB replacement algorithm that ran in under 50 cycles11:54
terpstrapossibly even under 3011:54
juliusbno, i'm not talking /MHz here, I'm talking overall performance because Linux is just a state-swapping machine11:54
juliusbalways loading and storing and accessing various process states11:55
terpstrai see11:55
juliusbterpstra: sure, but what about saving and configuring your state to get into the place where you can then do your TLB algorithm in 30 cycles??11:55
terpstrathat's a good reason to make kernel-land not mmu mapped ?11:55
juliusbI think it's a good reason to avoid Linux :)11:56
terpstrajuliusb, i was including the save/restore in that 30-cycle estimate11:56
terpstraif we added an mmu to the lm32 it would launch an exception handler where you do a quick LRU/heap operation and then an eret11:56
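A hedged C sketch of the refill handler terpstra describes; sizes and the replacement policy are assumptions (round-robin rather than true LRU, since it needs only one counter per set), and nothing like this exists in lm32 yet. In hardware the entry point would be the miss exception and the return would be an eret.

```c
#include <stdint.h>
#include <stdbool.h>

#define TLB_WAYS  2                  /* assumed 2-way TLB */
#define TLB_SETS  16
#define PT_PAGES  1024               /* toy flat page table */

struct entry { uint32_t vpn, pfn; bool valid; };

static struct entry tlb[TLB_SETS][TLB_WAYS];
static uint8_t      victim[TLB_SETS];      /* round-robin pointer per set */
static uint32_t     page_table[PT_PAGES];  /* vpn -> pfn */

/* TLB-miss "exception handler": fetch the mapping from the page table,
 * evict the round-robin victim, write the entry back, return (eret).
 * Returns the refilled pfn. */
static uint32_t tlb_refill(uint32_t vpn)
{
    uint32_t set = vpn % TLB_SETS;
    struct entry *e = &tlb[set][victim[set]];
    victim[set] = (uint8_t)((victim[set] + 1) % TLB_WAYS);
    e->vpn   = vpn;
    e->pfn   = page_table[vpn];
    e->valid = true;
    return e->pfn;
}
```

Each refill is a handful of loads, stores and one modulo, which is what makes a sub-50-cycle handler plausible; the part juliusb is worried about, the exception entry/exit state save, sits outside this function.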
juliusbterpstra: It's not so much but with a pissweak TLB you're doing it all the time (seriously, every new function call) and it adds up11:56
lekerneljuliusb: how many function calls are new?11:57
lekernelyou're talking about lazy linking, right?11:57
juliusbI'm not sure exactly how it works but I'm pretty sure it occurs quite frequently11:58
juliusbwell, anything outside of the page11:58
juliusbwell, instruction and data, too, mind you11:58
terpstrawouldn't the page with the 'got' stay in the TLB most of the time?11:58
lekernelwell, the TLB miss on each new function call just hits at application startup11:58
juliusbhopefully the data TLB miss doesn't occurr so often11:58
lekernelthe code gets patched after that and no longer misses the TLB11:58
terpstralekernel, the code doesn't get patched -- the 'got' gets filled11:58
juliusbI'm talking about statically linked programs here, I don't know about dynamicaly linked stuff11:59
terpstrayour function calls to global symbols go via the data bus11:59
juliusbwe don't have dynamic linking yet in our toolchain, but we're working on it, and it looks like extra headache for userspace execution11:59
terpstrayes, indirection is expensive11:59
terpstrai'm somewhat skeptical that the TLB miss rate is so high12:00
juliusbbut, I'm contributing to this discussion because I'm going to be starting some work shortly on really gauging the overhead of TLBs12:00
terpstrawhy would the mips folks move from 64 TLB entries to 48 if it is such a problem?12:00
lekernel_terpstra: yeah, you're probably right. but in either case, I don't think that lazy linking significantly increases any TLB miss rate.12:01
juliusband our feeling, after playing with our port, is that TLB misses occur often, and a good way to increase time spent doing useful things, rather than management overhead, is minimising this12:01
terpstrajuliusb, fair enough.12:01
terpstrayour current tlb is how big?12:01
juliusbwe can have up to 12812:01
terpstraand you still have lots of misses, eh? that's somewhat worrying. 2-way associative?12:02
juliusbbut is single way12:02
terpstrathen i believe you12:02
juliusbyes, I want to add ways12:02
terpstramost TLB in 'real hardware' is CAN12:02
terpstraso fully associative12:02
juliusbah ok12:02
terpstrasorry, CAM12:03
terpstrai typo'd12:03
lekernel_2-way associative looks doable... lm32 does it for the caches12:03
juliusb... or come and pimp out the OR1200's TLBs to do multi-way ;)12:06
terpstragive me the or1k vs. lm32 sales pitch :)12:06
juliusbwell, I'm not the expert but I know the licensing on LM32 isn't pure BSD (has some taint from LM), whereas or1200 is all LGPL12:09
juliusbi don't know LM32 architecture so well, but I think OR1K has pretty solid architecture, missing a few key things like atomic synchronisation instructions12:09
juliusbbut those can be added12:09
juliusbOR1200 as an implementation is bad I think12:09
juliusbI've been hacking on it for a few years and hopefully had made it better, but certainly it hasn't become leaner and more efficient12:10
lekernel_which diminishes your point about LGPL12:10
juliusbour toolchain is good now12:10
terpstraso your position is that the or1k + toolchain + kernel support is good, but the or1200 implementation is the bad part?12:10
juliusbour toolchain was a joke, but now it's good12:10
juliusbyes, but it at least has MMUs already in there to save you working on that, and I think having a full-on kernel port (we're going to start pushing for acceptance in GCC and Linux sometime this year) is a pretty big deal12:11
juliusbit's a lot of work to add all the bells and whistles12:11
juliusbor1200 isn't bad, it's just not awesome12:12
juliusb... i may know of a rewrite in progress12:12
juliusb... but that's a little ways off yet12:12
lekernel_gcc/linux kernel: true. but as far as I'm concerned it is not my priority12:12
terpstrabinutils+gcc for lm32 is already in mainline12:12
juliusblekernel_: I understand you need as much performance as possible, but again I ask why even consider Linux when you need to be productive on almost every cycle, the pitch kind of isn't for that12:12
terpstraso here the lm32 is further than the or1k12:12
juliusbit's for anyone considering Linux12:12
lekernel_neither is the MMU, and I cannot accept the regressions that OR1K would introduce just to get some work already done on the MMU12:13
juliusbok, sure, but we will be sometime this week12:13
lekernel_terpstra: otoh the mainline lm32 gcc is often broken... it was somewhat acceptable in gcc 4.5 and was badly broken in 4.612:13
terpstrais there a good document for the or1k comparable to the lm32's archman pdf?12:13
juliusbi'm saying that, as an open source CPU with a working full-on kernel port, or1200 is worth considering12:13
lekernel_maintaining gcc is a pain in the ass12:14
juliusbyep, but we have guys doing that12:14
juliusbterpstra: yes, we have recently re-worked the architecture spec12:14
juliusbcleaned it up, etc12:14
terpstracould you toss me a link?12:14
terpstrai'd like to read it12:14
juliusbhttp://opencores.org/download,or1k - click on the openrisc_arch_submit4.odt link12:15
juliusbit's not in SVN yet I think12:15
juliusbwe've still got it out for review12:15
juliusbbut... it's on logincores.org (opencores.org I mean)12:15
juliusbgotta register12:15
lekernel_juliusb: I'm not considering linux, except for demos and just the fun of it12:15
lekernel_juliusb: when are you going to change that policy?12:16
terpstrai have an opencores account, not a problem,12:16
juliusbi just had lunch with the guy in charge here, he's not convinced12:16
juliusbI tried12:16
juliusbhe argues: what's the big deal - you're getting access to stuff for free, give us some information we can provide to advertisers, so we can fund the webserver12:16
rohjuliusb: never discuss too much with stupid people. work around them.12:17
juliusbwell, there's already a fork happening: openrisc.net12:17
juliusbthey got fedup with opencores12:17
lekernel_another irritating thing in the opencores policy is the requirement that files be uploaded to your server, which in turn mandates the use of SVN and your web interface, both far inferior to e.g. git and github12:17
juliusbsure, I think they're fighting a losing battle12:17
juliusbohh nice, ohwr.org12:17
terpstra(that's where my stuff lives)12:18
juliusbcool, thanks12:18
juliusbanyway, this is an ongoing thing with OpenCores - they still don't see, even after talking a lot with them, why they can't take a little if they give a little12:18
lekernel_and btw I can't see why running such a webserver would be so expensive12:19
juliusbI'm at least trying to get them to dump the forums and bugtracker (both some custom hack they got this young guy to do) and use a mailinglist and bugzilla12:19
juliusbya, well, it shouldn't be, but it is if you go about it the wrong way for 3 years12:19
juliusbI think their heart is in the right place - they didn't want OpenCores to die and thought they could make it great12:19
juliusbbut I think they're not so open-sourcey12:19
juliusbi probably shouldn't be saying this :P12:19
juliusbit's in flux, I hope, and things will change eventually12:20
terpstrameh - until someone writes an opensource hdl toolchain, we don't reallllly have 'opencores' anyway12:20
lekernel_well, you're among friends. I'd even dare say you've just joined the #opencores-haters channel *g*12:20
juliusbi know the guy who started openrisc.net well and it'll be interesting to see the response they have12:20
juliusbhehe sure, and I'm working hard on OpenRISC and just like to see others getting into the oshw stuff, too12:21
lekernel_terpstra: this is under way :p12:21
juliusbi come in peace, but I'm employed by ORSoC and feel I should at least try to provide them with good advice on OpenCores12:21
terpstrajuliusb, i don't hate opencores. i hate the blinky flash ads. ;)12:22
juliusbbut, anyway, just wanted to point out if you really want Linux on an open source CPU, try Or1K12:22
juliusbI think there's some tuning to be done, like anything, but it's probably a good place to start12:22
terpstrajuliusb, i will read the arch manual and then form a more informed opinion :)12:22
juliusbi expect nothing less :)12:23
juliusbbut I, too, am very interested in the fully open source toolchain for HDL synthesis and backend12:23
juliusbhence popping in here the other day to ask lekernel_ about his work so far12:24
lekernelhe, it's coming :)12:25
terpstrajuliusb, or1k has a branch delay slot?12:25
lekernelwanna help?12:25
terpstrawasn't this proven to be a bad idea by mips?12:25
terpstralearn from the past! ;)12:25
lekernelwhy is it a bad idea?12:25
juliusbarchitecture is initially from 199912:25
lekernelfwiw microblaze has it, and from studies I've read it does provide a performance advantage12:25
terpstra"The most serious drawback to delayed branches is the additional control complexity they entail. If the delay slot instruction takes an exception, the processor has to be restarted on the branch, rather than that next instruction. Exceptions now have essentially two addresses, the exception address and the restart address, and generating and distinguishing between the two correctly in all cases has been a source of bugs for later designs."12:25
juliusbi'm dealing with this now, actually12:26
lekernelwhat is in fact a bad idea is have several delay slots12:26
lekerneljust one is still reasonable12:26
terpstrahttp://en.wikipedia.org/wiki/Classic_RISC_pipeline -- scroll down to the area where they list the reasons12:26
terpstrathat reason is just the most pertinent i think12:26
juliusbwell, I think the control overhead of having one compared to none is far more than from having one compared to two12:26
juliusbit's a hassle for out of order etc12:27
lekernelwell a lot of features make a mess of exceptions. out of order execution being the most infamous for that.12:27
lekernelbut if you want a simple design, then yeah it's probably better not to have the delay slot12:27
juliusbthat sounds about right, but it just adds a little bit of extra complexity where you don't want anything extra12:27
lekernelit does increase performance, so it's a trade-off12:28
terpstralekernel, it increases performance only if the compiler can find a good instruction to put there12:28
terpstrawhich at the end of a basic block usually means putting a 'write to memory'12:28
juliusbbut pipelines that run really fast now are very long12:28
lekernelyes. but from the paper I've read it still works12:28
terpstrabut those are precisely the instructions which generate faults12:28
juliusbpart of the idea was to offload complexity into the compiler from the HW, as the HW development wasn't so advanced right?12:29
juliusbbut now it just makes things more complicated at the HW level12:29
terpstrai am a firm believer in simpler cores, but many cores12:30
juliusband compilers are actually fairly clever now, so I guess that's not an issue, but why cause the HW to be more complex when really there's marginal benefit12:30
terpstrawe've carried the hardware supporting crappy sequential software about as far as it can go12:30
lekernelwell... if you have OOO execution, delay slots sure make no sense12:30
juliusbyes, as someone who writes, tests and debugs cores, I would eliminate the delay slot12:30
lekernelbut I wouldn't toss it as a definitely crappy idea either12:31
terpstrafair enough12:31
lekernelI think it still does some good in some cases.12:31
juliusbfor OR2K, we propose eliminating them http://opencores.org/or2k/OR2K:Community_Portal12:31
terpstrai agree it is a nice way to avoid the wasted instructions you otherwise have12:31
juliusbyes, for the simple 4/5-stage pipelines, they do gain you some advantage there compared to not having them12:32
lekerneljuliusb: do you want to help with the synthesis toolchains?12:33
lekernel(speaking about delay slots: for the OR2K, sure, eliminate them)12:34
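The trade-off argued above can be made concrete with a toy simulator: with a single delay slot, the instruction after a taken branch still executes, which is exactly why an exception in that slot needs both its own address and a restart address. This is a sketch over a made-up two-instruction mini-ISA, not any real LM32 or OR1K encoding:

```python
# Toy pipeline semantics: with a delay slot, the instruction after a
# taken branch executes before control transfers. Hypothetical mini-ISA.

def run(program, delay_slot):
    """Execute ('add', n) / ('br', target) tuples; return (trace, acc)."""
    trace, pc, acc = [], 0, 0
    while pc < len(program):
        op, arg = program[pc]
        trace.append(pc)
        if op == "add":
            acc += arg
            pc += 1
        else:  # 'br'
            if delay_slot:
                # The delay-slot instruction executes even though the
                # branch is taken; if it faulted, the exception would
                # carry two addresses: the slot's and the branch's.
                slot_op, slot_arg = program[pc + 1]
                trace.append(pc + 1)
                if slot_op == "add":
                    acc += slot_arg
            pc = arg
    return trace, acc

prog = [("add", 1), ("br", 4), ("add", 10), ("add", 100), ("add", 1000)]
print(run(prog, delay_slot=False))  # ([0, 1, 4], 1001): slot skipped
print(run(prog, delay_slot=True))   # ([0, 1, 2, 4], 1011): slot runs
```

The one-slot version fills an otherwise wasted cycle (the `add 10`), which is the performance upside terpstra and lekernel weigh against the exception-handling complexity.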
juliusblekernel: probably not right at the moment, sorry, I was just curious to see how it was looking12:35
juliusbperhaps in a while, though12:35
juliusbI think it's definitely needed and would be very cool12:35
lekernelthere are some relatively simple things to do, like implementing Verilog case statements12:35
juliusbmainly i'd be interested to see an open source synthesis engine12:35
juliusbto check the impact of various design choices12:35
lekernel(all that's needed is to translate those statements into IR muxes)12:35
lekernelat least for now, then we'll see how to do things like FSM extraction12:36
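The lowering lekernel describes treats a case statement as a priority chain of 2-input muxes; it can be sketched in a few lines. The IR tuples below are illustrative inventions, not LLHDL's actual data types:

```python
# Lower a Verilog-style case statement into nested 2-input muxes.
# IR nodes here are made-up tuples, not LLHDL's real IR.

def lower_case(selector, arms, default):
    """arms is a list of (match_value, result) pairs; first match wins."""
    node = default
    # Build from the last arm outward so earlier arms test first.
    for value, result in reversed(arms):
        cond = ("eq", selector, value)      # selector == value
        node = ("mux", cond, result, node)  # cond ? result : node
    return node

ir = lower_case("sel", [(0, "a"), (1, "b")], "dflt")
print(ir)
# ('mux', ('eq', 'sel', 0), 'a', ('mux', ('eq', 'sel', 1), 'b', 'dflt'))
```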
juliusbcool, if I get some time i'll let you know, will find out how to get started12:38
lekernelok. just ask here or on llhdl@lists.milkymist.org if you have questions or problems.12:39
juliusbwill do12:40
lekernelbtw, I was a bit stuck lately with the placement engine12:41
lekernelI wanted to do post placement packing, but this is rather hard especially with the current chip database architecture12:41
lekernelso I think i'll revert to good old pre-placement packing heuristics for now12:42
lekernelnot sure how well it's going to work with the relatively complex s6 slices, but we'll see12:42
lekernelmaybe it works great12:42
lekernelas a matter of fact, I think Altera has even more complex logic blocks ("LAB clusters" or something)... and it's not clear how they pack them12:43
lekernelalso, with post placement packing, I'd lose one of the potential benefits of clustering, which is that the placer algorithm can be faster because it has to deal with fewer elements12:44
lekernelso perhaps it's simply a bad idea after all12:44
terpstralekernel, why does an LM32 dcache read (lw instruction) take 3 cycles for result? X stage calculates address, M stage touches cache.... what happens in W stage?13:06
lekernelwrite to register file?13:07
lekernelI don't know13:08
terpstrabut at the end of the M stage it could have used the bypass13:08
terpstrajust like the 2-stage shift instruction does13:08
terpstrathere's an "align" step in the block diagram13:10
terpstraso D fetches base register, X adds offset, M fetches the cache, and W 'aligns' the result (and writes back to register file at end of cycle)13:10
terpstrawhat is this magical align?13:11
lekernelI guess this is for reading bytes or 16-bit words on any offset13:11
terpstraand sign extension / etc13:12
terpstramakes sense13:12
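The align step guessed at above (sub-word extraction plus sign extension) is easy to sketch for a signed byte load. This assumes LM32's big-endian byte order and is a guess at the behavior, not the actual RTL:

```python
# Sketch of the W-stage 'align' for a signed byte load (lb) from a
# 32-bit big-endian word. A behavioral guess, not LM32's RTL.

def align_lb(word, addr):
    """Extract the byte at offset addr & 3, sign-extended."""
    shift = (3 - (addr & 3)) * 8       # big-endian: offset 0 is the MSB
    byte = (word >> shift) & 0xFF
    return byte - 0x100 if byte & 0x80 else byte

print(align_lb(0x12F45678, 1))  # -12  (0xF4 sign-extended)
print(align_lb(0x12345678, 3))  # 120  (0x78, positive, unchanged)
```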
lekernelhi xiangfu14:29
lekernelhi guyzmo14:42
guyzmohey :)14:45
guyzmosorry, was plugging in stuff14:45
guyzmodamn, so sad rlwrap can't work over flterm :/14:52
guyzmo(and all control characters just output garbage)14:53
guyzmocan't get the led par to light up :/15:04
lekerneldid you try it in flickernoise?15:28
guyzmonot yet15:33
guyzmoof course I'm gonna try it15:33
lekernelcontrol panel -> dmx -> dmx table (called "dmx desk" if you have upgraded - I don't want to be negative here, but I'd tend to bet you did not)15:34
lekernelfortunately the dmx desk works with all released versions :-)15:35
guyzmodamn, why did I forget my DMX cable :-S15:49
kristianpaulFallenou: (registers) like in the drivers and sys_conf.h?16:15
kristianpauloh, yes i think16:20
guyzmonone of my XLR cables work with DMX signal18:37
guyzmothough I remember we had one of them working18:38
guyzmoI will have to get one cable from the Gaîté Lyrique tomorrow18:40
lekernel"Since writing it I've made features at 5 micron half-pitch using the camera-port method, and am about to buy a 1-watt 385nm LED as an exposure source. This is way more power than I need so I will be able to use a nice thick diffuser on it. Once the exposure lamp is fixed I should be able to make 75 λ square dies at 5 micron resolution using the 40x objective, or 20 micron using the 10x."22:14
wpwraklekernel: the first one looks like a forest ;-)22:27
lekernelhi azonenberg22:28
lekernelwelcome, honored to see you here :)22:28
lekerneli'm sebastien22:28
azonenbergah, k22:28
azonenbergLet's move our discussion here rather than fb chat so other people can see22:28
azonenbergThe paper i sent you only describes my work at the 15um node22:29
lekernelok :)22:29
azonenbergThough i did outline the process that I later reached 5um at22:29
lekernelhow do you engrave through the silicon?22:30
azonenbergI plan to open the project as much as possible btw, all tools etc will be released under an open license (probably BSD or similar)22:30
lekernelexcellent :)22:30
azonenbergRead the FB note (which i need to post publicly somewhere)22:30
azonenbergLong story short, apply hardmask (probably Ta2O5) to the silicon by spin coating and heat treatment22:30
azonenbergSpin coat photoresist over that22:31
azonenbergexpose and develop22:31
lekerneli see22:31
azonenbergEtch hardmask with 2% HF (Whink rust remover, same stuff jeri uses for gate oxide)22:31
azonenbergThen etch the silicon using 30% KOH / 15% IPA / 55% water at ~80C22:31
lekernelsorry about the dumb question, i'm still going through the pile of material and links on your website and fb :)22:31
azonenbergYou cant use KOH directly because it will attack the resist22:31
azonenbergLol, no questions are dumb22:32
azonenbergFor the record i have no formal training in EE myself :P22:32
azonenbergmy BS (and PhD in a few years) will be in comp sci22:32
azonenbergAnyway so the nice thing about KOH is that its very anisotropic22:32
azonenbergFeCl3 and similar etchants for copper, if you've ever done home PCB fab, are isotropic - they eat equally in all directions22:33
azonenbergSo you get rounded sidewalls and such22:33
azonenbergBut KOH eats along the <100> crystal plane nearly 100x faster than <111>22:33
azonenbergAnd <110> is a hair slower than <100> but not by too much22:33
lekernelcool. I talked about this to a fab employee, and he told me I'd never get any good anisotropic etchant because they are super expensive, hard to buy, etc.22:33
lekernelif it's just KOH, well... :)22:33
azonenbergIf you get <110> you can go straight down (assuming your features are parallel to the <111> plane)22:34
azonenbergh/o let me send you a paper22:34
azonenberg"Fabrication of very smooth walls and bottoms of silicon microchannels for heat dissipation of semiconductor devices"22:34
azonenbergLook at the figure they have in there (fig 9 i think?) - 400 micron deep etch with almost vertical sidewalls22:34
azonenbergif i didnt know better i'd say it was made with RIE22:34
azonenbergthat's what i used as the starting point for the comb drive process i have on fbook22:35
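The ~100:1 etch-rate ratio quoted above is enough for a back-of-envelope check on why those sidewalls come out nearly vertical. The numbers are taken from the figures mentioned in the conversation and are illustrative only:

```python
# Back-of-envelope: lateral undercut accumulated during an anisotropic
# etch, assuming the ~100:1 <100>:<111> rate ratio quoted above.

def lateral_undercut_um(depth_um, anisotropy_ratio):
    """Sideways etch (um) accumulated while etching depth_um straight down."""
    return depth_um / anisotropy_ratio

# A 400 um through-etch at 100:1 loses only ~4 um per sidewall,
# which is why the channels in the paper look almost RIE-vertical.
print(lateral_undercut_um(400, 100))  # 4.0
```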
Action: lekernel warms up his university proxy to get through the cretinous sciencedirect paywall22:35
azonenbergI have an openvpn server running at a friend's house22:36
azonenbergthe machine in my office on campus, and my laptop here, tunnel into it22:36
azonenbergthen the office machine advertises routes to most journal websites ;)22:36
lekernelthat's more sophisticated than what I do... I use ssh redirect and /etc/hosts22:37
azonenbergI run OSPF http://pastebin.com/Tn4T2e8k22:39
azonenberg.11 is the vpn addy of my box on campus lol22:39
lekernelhm... can't reach any server at uni tonight22:40
Action: azonenberg mirrors22:40
lekernelhave you done multilayer yet?22:41
azonenbergI havent done any etching yet since i cant afford the materials until my next payday lol22:42
azonenbergLook at the date on the paper22:42
azonenbergi only got litho working reliably last week22:42
lekernelyeah, saw it :)22:42
azonenbergthis was an unsolved problem for months22:42
lekernelman that's awesome work22:42
azonenbergcant believe how simple the solution turned out to be lol22:42
lekernelbest hack i've seen lately :-)22:42
azonenberglekernel: http://colossus.cs.rpi.edu/~azonenberg/mirror/smoothwalls.pdf22:45
lekerneldo you think you can etch vertically like this in e.g. SiO2?22:46
azonenberglekernel: Why not?22:46
azonenbergI can buy KOH for $4 a pound22:46
lekernelI don't know... since you are relying on the crystal structure22:46
lekernelwhat happens when you grow oxide on a wafer? do you have a neat crystal structure or a messy one?22:47
azonenbergFirst off, i will be buying wafers aligned to <110>22:47
azonenbergProbably these http://www.mtixtl.com/sisinglecrystalsubstrate110orn10x10x05mm1spundoped.aspx22:47
azonenbergthey arent technically wafers as they arent round, but <110> is hard to find in full wafers for decent prices22:47
azonenbergAnd i will not be growing oxide, also22:48
azonenbergiirc they used Si3N4 deposited by LPCVD as a hardmask, but i dont have CVD capabilities22:48
azonenbergSo i'll be spin coating this stuff http://emulsitone.com/taf.html22:48
lekernelso you want to focus on MEMS?22:48
azonenbergAfter heat treating it forms Ta2O5, which is pretty easy to etch with HF22:48
lekernelgrowing oxide is mandatory for most transistors (afaik)22:49
azonenbergBut it's resistant to alkaline etches22:49
azonenbergDielectric is, it need not be SiO222:49
azonenbergtantalum pentoxide was actually considered as a high-K dielectric for DRAM a while back - it would work22:49
azonenbergBut emulsitone also sells a SiO2 coating solution22:49
azonenbergAnd, more importantly, i plan to buy a furnace i can do thermal oxidation in22:50
azonenbergI just dont have $1200 to spare yet22:50
azonenbergi can do bulk micromachining for much less ($500 or so)22:50
azonenbergIncluding all of the consumables22:50
azonenbergCMOS is definitely on the to-do list but its down the road22:50
lekerneldo you know about this? http://visual6502.org/22:51
azonenbergamong other things because transistors are so sensitive to trace metal contamination whereas MEMS are less so22:51
lekernelthere are also the 4004 masks published by Intel for you to chew on :-)22:51
azonenbergI do reversing too22:51
azonenbergLol, um22:51
lekernelless transistors than the 650222:51
azonenbergyou *do* know that one of my dreams has been to make a 1:1 scale model of the 4004?22:51
azonenbergfully functional22:52
lekernelhaha :)22:52
azonenbergBut like i said mems is easier so that comes first22:52
azonenbergno need for doping or tons of masks, the process i'm looking at only needs three masks and only one even somewhat precise alignment step22:52
azonenbergthe first mask is contact litho at λ = 200um lol22:52
azonenbergjust thinning the wafer in the middle and leaving a thick rim around the edge for handling22:53
azonenbergthen the through-wafer etch for the fingers followed by metal 122:53
azonenbergthough, as you saw in the paper, getting sub-5um alignment will be pretty easy22:55
lekernelanother thing that could potentially be interesting is MMIC's22:55
lekernelmicrowave ICs22:55
lekernelthose are a pain to buy22:55
azonenbergoh... Those will be trickier - tighter tolerances22:55
lekerneldo you think so?22:55
lekernelmaybe the transistors are22:55
azonenbergOnce i get the basic process working i'll see where it goes lol22:55
lekernelbut a big MMIC advantage is in the ability to print microstrip lines with more precision than on a PCB22:56
azonenbergGood point22:56
azonenbergActually, funny thing - i was thinking of making a hybrid of PCB and IC technology at some point to do massively multilayer boards22:56
lekernelI actually do not know how to build a good microwave transistor22:56
azonenbergStart with dual layer FR4 with copper on both sides22:57
lekernelbut it does seem to use very nasty chemicals like germane gas22:57
azonenbergPattern your metal 1 and 2 (for power distribution)22:57
azonenberglay down oxide on top of M222:57
azonenbergsputter or evaporate a micron or so of Al or Cu, etch M322:57
azonenbergrinse and repeat lol22:57
lekernelgermane is one of the few chemicals I dare not touch, close to sarin gas and the like22:58
azonenbergWhat about concentrated HF?22:58
azonenbergor SiH4?22:58
azonenbergI draw the line at 2% HF myself lol22:58
lekernelHF is still a lot less dangerous than germane22:58
lekerneleven concentrated HF22:58
azonenbergThey use that for ion implantation22:58
azonenbergArsine too22:59
azonenbergNeither of those are healthy to be around22:59
azonenbergMy process will be diffusion based using spin on dopants though22:59
azonenbergLess precise but safer and requires less fancy equipment22:59
azonenbergjust HF wet etch the doped oxide film, coat undoped oxide around it, and heat for a while22:59
azonenbergAccording to wiki, GeH4 is used for CVD epitaxy in a similar manner to SiH423:01
azonenbergSo that means they're using germanium based substrates23:01
lekernelso no CVD etc.?23:02
azonenbergI'm ranking processes in order of preference23:03
lekernelwhat about metal layers? how can you do them without PVD?23:03
azonenbergSpin coating is pretty much impossible to avoid and easy to do (though precise coating thickness control will be a bit tricky until i get a speed controller)23:03
azonenbergMetalization will be done by filament evaporation or DC sputtering23:03
azonenbergI'm exploring both in parallel and whichever one starts working first is the one i'll use23:03
azonenbergthough eventually i want both23:04
azonenbergThermal diffusion is going to be necessary for CMOS but not MEMS23:04
azonenbergor at least, not the comb drive23:04
lekernelheard of this? http://www.gdiy.com/projects/thin-film-sputtering-machine/index.php23:05
azonenbergNo, actually, I havent23:05
azonenbergBut i do have a friend doing research in sputtering23:05
lekernelthere you can get your metal layers :-)23:05
azonenbergMetalization was my second area to focus on after litho23:05
azonenbergTo be done in parallel with etching23:06
azonenbergI really havent studied it in nearly as much depth lol23:07
lekernelat electrolab (a hackspace near Paris) someone got their hands on a couple of turbopumps. we haven't used them yet, though.23:10
lekernelI was actually thinking about doing the sputtering first23:10
azonenbergI was planning to do thermal evaporation initially, actually, since i thought it would be easier23:11
lekernelyeah, maybe I'll start with that too :)23:11
azonenbergbut if you get sputtering working I might send you guys a few dies to metalize lol23:11
azonenbergthe tricky thing with sputtering is gonna be doing it *cheaply*23:12
azonenbergFor $3.5K - $5K you can buy a small sputtering rig from MTI or similar23:12
lekernelmy #1 problem is time (and then money to build such expensive stuff). i'm doing too much stuff ...23:12
azonenbergHomebrewing cheaper is not going to be easy23:12
azonenbergBut evaporation looks like it will be a lot easier to do cheaply23:12
lekernelyeah probably23:12
azonenbergYou need a high current, precisely controlled power supply (may be possible to adapt one designed for welding, i may build one for the low-power ~100W prototype)23:13
lekernelwith a little effort we can also probably get an old evaporator from the 70s too23:13
azonenbergA 2-stage rotary vane vacuum pump will get me down to ~40 mtorr, i dont know if thats deep enough23:13
azonenbergTed Pella will sell tungsten boats, filaments, etc for a decent price23:13
lekernelwe merely need to rent a van and drive it on some 600km to pick the evaporator up :)23:13
azonenbergAs with wire / pellet charges for evaporation23:13
lekernelbut again there are time problems23:13
azonenbergI projected (given the pump and vacuum gauge i am thinking of borrowing from a friend) that building a working evaporator would cost ~$1.5K23:14
azonenbergmaybe only $1K23:14
lekernelhttp://paillard.claude.free.fr/ is very cool too23:14
lekernelthat guy built his vacuum pumps himself23:15
lekernelincluding a molecular one23:15
azonenbergNice, but i dont know french :(23:15
lekernelunfortunately he's stopped doing this23:15
azonenbergAnd i dont plan to build a pump since i can get access to one23:16
azonenbergOr, at least a roughing pump23:16
azonenbergif high-vac turns out to be necessary i may try my hand at making a diffusion pump23:16
lekernelsure. but vacuum pumps are otherwise expensive like hell, so it's good if there is a DIY alternative23:16
azonenbergunitednuclear sells a 2-stage rotary vane roughing pump for $29523:17
azonenbergi cant imagine DIYing one for less23:17
lekernelin fact, vacuum anything is expensive like hell, even when it clearly needs not to be23:17
azonenbergBut i am not really focusing on vacuum too much yet23:17
azonenbergI'm designing processes in the order that i'd use 'em23:17
azonenbergand next after spin coating and exposure is etching23:17
lekernelthat guy http://benkrasnow.blogspot.com/2011/03/diy-scanning-electron-microscope.html uses spark plugs as voltage feedthrough23:18
azonenbergYeah, i saw that one23:18
lekernelthose otherwise cost around 100-200€ or so at a professional vacuum equipment manufacturer23:18
azonenbergNot bad at all23:18
lekernelrotary vane pumps aren't the worst... the main problem is turbomolecular pumps which are around $800023:20
azonenbergTurbopumps are not cheap, that's for sure23:21
lekerneland also seem to be easily damaged if for example your vacuum is suddenly broken with the pump running23:21
azonenbergBut do you really think you can build one?23:21
azonenbergAnd yes, that will kill them23:21
lekernelwell, apparently Claude Paillard did something like that23:21
lekernelyeah :)23:22
lekernelhis work is amazing23:22
azonenbergBut the question i'm asking right now is, how high vacuum is needed for basic evaporation?23:22
lekernelunfortunately he did not publish all the details and he's no longer into that23:22
azonenbergIf I purge the chamber with argon or something to remove any traces of oxygen23:22
azonenbergthen pump down to 40 microns vacuum23:22
azonenbergwill that be adequate?23:22
lekernelthat's what I'm thinking too. but why is it that no professional installation does that?23:22
azonenbergI mean, i've seen DC sputtering done at ~100 mtorr23:22
azonenbergIts probably less efficient, slower deposition, etc23:23
azonenbergBut for DIY the first rule is "make it work"23:23
azonenbergnot "make it cost effective for mass production"23:23
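One standard way to sanity-check the 40 mtorr question is a mean-free-path estimate. The constant below is the usual order-of-magnitude approximation for air (λ in cm ≈ 5e-3 / P in torr); it is an assumption added here for illustration, not a figure from the conversation:

```python
# Mean free path vs. pressure, using the common order-of-magnitude
# approximation for air: lambda_cm ~ 5e-3 / P_torr. The constant is
# an assumption added for illustration, not from the conversation.

def mean_free_path_cm(pressure_torr):
    return 5e-3 / pressure_torr

# At 40 mtorr the mean free path is only ~1.25 mm, far shorter than a
# typical ~20 cm source-to-substrate throw, so evaporated atoms would
# scatter many times in flight.
print(mean_free_path_cm(0.040))  # 0.125

# Pressure giving a 20 cm line-of-sight path:
print(5e-3 / 20)  # 0.00025 torr
```

On this estimate a roughing pump alone falls a few orders of magnitude short of line-of-sight evaporation, which is consistent with the later talk of needing a diffusion or turbomolecular pump.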
lekernelwell, even in research labs when mass production isn't a priority, all sputtering i've heard of is done with first high vacuum then letting a little bit of noble gas in23:24
azonenbergI'm not sure why23:24
lekernelI'm asking myself the same question.23:24
azonenbergBut RF sputtering is normally done at much lower (1-2 mtorr) pressures23:24
azonenbergi'll be doing DC23:24
lekernelbut no one has been able to answer it yet23:24
azonenbergYep, one more item on the todo list23:25
azonenbergI want to set up some kind of proper website for coordinating this, now that i have people interested from all over the place23:25
azonenbergright now i'm the main guy pushing the research, i'm bouncing ideas off of two friends who live near me23:26
azonenbergand there are a bunch of folks i know online who i talk to about it here and there23:26
azonenbergBut there's no central location for posting status reports etc23:26
azonenbergAny recommendations on some kind of web-based tool that will work well for it?23:27
lekernelmaybe for starters, just a mailing list with public archives?23:27
azonenbergI set up the group "homecmos" on google groups but there's been zero traffic so far lol23:28
azonenbergi havent tried using it much23:28
lekernelpersonally I don't really like google groups... good old mailman is best23:28
azonenbergWant to host the list somewhere? Be my guest23:29
lekernelI can probably create you a mailman list on lists.milkymist.org23:29
lekernelif you want...23:29
azonenbergthat might work... right now i'm still trying to figure out what kind of web presence to have23:31
azonenbergright now its just static html hosted from my office box lol23:31
azonenbergany wiki hosts to recommend?23:31
lekernelotherwise I think sourceforge also provides mailing lists23:32
lekernelwiki... hmm... actually, no23:32
lekernelI use mediawiki and it's awful because of spam problems23:32
lekernelit would not even let you mass delete accounts or edits and comes with no captcha by default23:32
azonenbergAs a minimum I want a wiki (posting restricted to registered users probably) and a mailing list23:32
lekernelso a default mediawiki installation is unusable because it gets vandalized daily by bots and you spend hours fixing it12:33
azonenbergYeah, i run default mediawiki for one project but its internal and on a LAN-only server23:33
azonenbergbehind a firewall23:33
lekernelthere's also github which provides a wiki23:33
azonenberggrrrr git23:33
lekernelthe nice thing is that the wiki is backed by a git repository23:33
Action: azonenberg prefers svn23:33
lekernelhuh? why?23:34
lekernelsvn is slower and more unstable than git23:34
azonenbergNever liked distributed vcs in general23:34
lekernelwell you can forget about the distributed features if you don't need them23:34
azonenbergi'm a big fan of continuous integration so i want everyone committing to trunk so the code gets as many eyes on it as possible early on23:34
azonenberggit seems to encourage branching to an extent i dislike23:34
lekernelthat is possible with git as well23:34
azonenbergbut i dont want to start any religious wars lol23:34
lekernelwell, personally, after I switched from svn to git I couldn't understand how I had endured svn for so long23:35
lekernelcorrupt repositories (both on client and server), slowness, bugs, segfaults, crashes, etc.23:36
lekernelI do not use the distributed features of git a lot either (though being able to commit while offline is nice), and use it mostly for its speed and robustness23:36
azonenberglol i've never seen any of those, but w/e23:37
azonenbergRight now i have an svn repo but its pretty empty, migrating wouldnt be hard23:37
wpwraklekernel: never had stability issues with svn. but i agree on the slowness. once you get used to the speed of git, svn becomes quite unbearable23:37
azonenbergI want the wiki and mailing list first, vcs can be hosted wherever23:37
azonenbergthoughts on google code? They support VCS backed wikis23:38
lekernelwpwrak: well you can try to grab the milkymist tree and commit it in one go to a svn repository. there's a good chance this will fail.23:38
lekernelwith git no problem23:38
wpwraklekernel: hehe, i'll pass :) but we used svn quite extensively at openmoko for many years and i don't remember any stability issues. we actually had more trouble with git :)23:39
azonenbergSo I think i'm going to go google on this23:41
azonenbergi already have the group so i'll google-code the wiki23:41
lekernelif you have a good wiki engine to recommend (mediawiki isn't) I can also host it for you23:42
azonenberglekernel: I dont, unfortunately23:43
azonenbergnice thing about google code is that the wiki is VCS backed23:43
azonenbergSo you can even send out commit emails on wiki changes etc23:43
lekernelbut I don't want to have more mediawiki problems. one wiki is already enough to get me pissed.23:43
azonenbergYeah lol23:44
wpwraklekernel: to paraphrase a joke i once heard about IBM: mediawiki is not a necessary evil. mediawiki is not necessary.23:45
lekernelthoughts about pmwiki?23:47
azonenberglekernel: Never heard of it, i think i'll run with google for a while and see how it works23:48
wpwrakazonenberg: btw, i agree that vcs-based makes a lot of sense. particularly if you also have an offline renderer/formatter such that you can edit your pages locally and just commit23:49
lekernelbtw use of mm w/ video input and camera: http://www.vimeo.com/2296610323:53
wpwraklekernel: (video) nice !23:58
--- Thu Apr 28 201100:00

Generated by irclog2html.py 2.9.2 by Marius Gedminas - find it at mg.pov.lt!