Notes from the weekly DAS/2 teleconference, 4 Dec 2006

$Id: das2-teleconf-2006-12-04.txt,v 1.1 2006/12/04 18:47:10 sac Exp $

Teleconference Info:
  * Schedule: Biweekly on Monday
  * Time of Day: 9:30 AM PST, 17:30 GMT
  * Dialin (US): 800-531-3250
  * Dialin (Intl): 303-928-2693
  * Toll-free UK: 08 00 40 49 467
  * Toll-free France: 08 00 907 839
  * Conference ID: 2879055
  * Passcode: 1365

Note taker: Steve Chervitz

Attendees:
  Affy: Steve Chervitz, Gregg Helt
  CSHL: Lincoln Stein
  Dalke Scientific: Andrew Dalke

Action items are flagged with '[A]'.

These notes are checked into the biodas.org CVS repository at
das/das2/notes/2006. Instructions on how to access this
repository are at http://biodas.org

DISCLAIMER:
The note taker aims for completeness and accuracy, but these goals are
not always achievable, given the desire to get the notes out with a
rapid turnaround. So don't consider these notes as complete minutes
from the meeting, but rather abbreviated, summarized versions of what
was discussed. There may be errors of commission and omission.
Participants are welcome to post comments and/or corrections to these
as they see fit.

Agenda
-------
  * HTML retrieval spec discussion
  * Status reports

Topic: HTML spec finalization
-----------------------------

gh: has everyone had a chance to check out the revised html version of
the retrieval spec since steve's changes?

ad: looks clean

sc: still some XXX comments here and there.

[A] Gregg will add more alignment examples to html get spec, cigar string
[A] All will take a final look at html get spec, paying attention to XXX flags.

gh: need to spend another day editing. It's a good sign that no one
has felt the need to change anything.

gh: now we have other docs to do: writeback, stylesheets, etc.

gh: finished the funding note for cshl sent to lincoln.

ls: allen will be able to start again in a few weeks. cannot make any
obligations to people now unless I can show there is money for it.
had to ask him to stop working immediately.

Topic: status reports
---------------------

ls: (re brian gilman, hapmap). Brian submitted a das2 plugin for
caCORE and a patch to the NCICB to allow caCORE to use his plugin. has
been problematic b/c they wanted it in time for their release, and
brian could not get their system to build for about 1 month. got code
in by our deadline, but not in time for their release. uncertain when
NCI will do a point release to bring this code in. people need to d/l
the NCI source code, apply the diff, and re-compile it, which is not
trivial as their build system is quite complex. so the code is there
in principle, useable in practice?
now he's working on das2 servers for hapmap and vert promoter db. has
the data, using allen's biopackages server, data should come up soon.
I suspect they will reject what he did. he sent them uml docs and
names of external libs, then started working on code, then chief s/w
guy at NCI said they wrote the plugin layer based on brian's docs.
once brian got the thing to compile, he realized it didn't work. so
it's been tough working with NCI s/w devs, they are annoyed at us
given our delay. doesn't impede das2 sources, but impedes the ability
of this highly visible toolkit to use das2.

gh: anything we can do to encourage?

ls: we'll see in a few days the reaction to brian's work.
complication - caBIG coordinator has left, new guy in place. possibly
a note from Tom Gingeras would help.

gh: definitely. The primary way to look at affy tiling data is via
IGB. It's important to be able to view hapmap data within IGB.

ls: The other way around is important as well: for the core caBIG to
have access to tiling data, they need the das2 client layer.

gh: can get something from tom on that too. It's on the agenda for the
affy server to serve up tiling data eventually.

[A] Gregg get letter from Tom Gingeras to support das2 in NCI caCORE

ls: update on perl das2 client - still where I left it after last code
sprint. needs 3-4 days of work.
will go higher in priority when hapmap and vert prom db are up, for
access to that data.

gh: new IGB release done over thanksgiving break, out on 11/27 (Ed E.
and I). Includes das/2 fixes and some new things: using das/2 to pull
in genomic location data for affy chip results.
Some background: the tool for generating results for affy expr and
exon chips is 'expression console', which generates results in CHP
format, which IGB can read. problem is that it has no genomic
locations, just probe set ids and p-values from experiments. So now,
when IGB loads a chip file it finds the matching coord data via the
netaffx das server and merges based on ids, to show results as heat
maps or graphs. integrates in nicely with das/2 client code in IGB.
runs through this optimizer, doesn't reload the data for that session.
can cache data for whole chromosome on your machine. uses alt file
formats to retrieve in an optimized binary format. lazy loading, only
for the chrm you are looking at. pretty happy with it as a good use of
das/2 completely behind the scenes.

gh: update broke the caching system that IGB is using (data retrieved
via urls is cached on local hard drive). now file names are too long
using full type/segment uri's in the das/2 queries, so the output of
my url -> filename conversion got too long. now using shortened
versions. works for netaffx server but not with biopackages server.
working on a fix soon.

sc: are there java libraries to create an md5 checksum on the full
long name?

gh: maybe, or I may have a way to map filename to integers. need to
investigate possible strategies.

gh: also did some fixes on the das2 server.

sc: updated the affy das servers to include the latest rat genome
assembly release (ucsc rn4, Nov 2004). added to our das/1, das/2, and
quickload servers. Added probe/probeset data for all exon arrays to
das/2 server. Fixed a bug in the exon array names to permit gregg's
genome location lookup tool to work.

gh: we need to map the chip type name (which we have no control over)
to the 'type' name in das2.
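
Sketch for the cache file name problem discussed above (sc's md5
question). This is only an illustration of the idea, not code from
IGB: the class name CacheFileNames, its methods, and the example URL
are all made up for these notes. It assumes the cache only needs a
short, stable file name per query URL. The JDK's standard
java.security.MessageDigest class can compute the md5 digest, so no
extra library should be needed.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of IGB: turns a long DAS/2 query URL
// into a short, fixed-length cache file name.
class CacheFileNames {

    // Returns the 32-character lowercase hex MD5 digest of the query URL.
    static String md5Hex(String queryUrl) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(queryUrl.getBytes(StandardCharsets.UTF_8));
            // %032x left-pads with zeros, so the result is always 32 characters.
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    // Builds a cache file name of the form <md5 hex>.<extension>.
    static String cacheFileName(String queryUrl, String extension) {
        return md5Hex(queryUrl) + "." + extension;
    }

    public static void main(String[] args) {
        // Made-up query URL, for illustration only.
        String url = "http://example.org/das2/genome/features"
                + "?segment=http://example.org/das2/genome/segment/chr1"
                + ";type=http://example.org/das2/genome/type/exon";
        System.out.println(cacheFileName(url, "cache"));
    }
}
```

The file name length no longer depends on the length of the type and
segment uri's. One trade-off: the name cannot be turned back into the
query, so a cache that needs the original URL would have to store it
separately (for example inside the file or in a small index).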

sc: affy das/2 server has just a subset of the genomes and annotations
available via quickload. das/1 server has support for 3' IVT array
design data and exon arrays, not all arrays or genome versions
supported due to memory limitation on the machine.

gh: I'm hoping to get signoff for the new hardware order on wed. A
quad opteron 32g expandable to 64g, should be nice.

sc: also replied to brian osborne on discussion list re: some das1 vs
2 issues. Seems like a good candidate for a faq item. We should set up
a faq, ideally on the wikified version of biodas.org. No progress on
the wikification project. Need to poke open-bio.org admins again.

[A] steve will set up faq on biodas.org
[A] steve will look into wikification of biodas.org

ad: working on proxy for translating das2 type queries into das1-style
queries, on servers that are on andreas' registry. asked andreas about
issues from various servers that appear to be not working.

gh: can andreas detect non-operating servers via automatic server
checking?

ad: the segments doc on das1: text id is gene id 'located on
chromosome 5', so a long string for segment id. valid xml but requires
a human to interpret.

ad: also taking people's das1 modifications, using my handwritten code
to apply their extensions, e.g., for ontologies. figuring out how to
make their adaptation work nicely. mostly just saying, "there was
extra data, you figure out how to use it."

ad: code for proxy is in dasypus sourceforge CVS. In two parts: manual
part goes to registry, updates local db. other part does proxying of
das1 system. haven't documented how that works.

[A] Andrew document das/1 proxy system in the faq (when faq is ready)

[A] Next meeting in two weeks (18 Dec 2006)