Notes from the weekly DAS/2 teleconference, 30 Jan 2006.

$Id: das2-teleconf-2006-01-30.txt,v 1.1 2006/01/31 18:48:53 sac Exp $

Note taker: Steve Chervitz

Attendees:
  Affy: Steve Chervitz, Ed E., Gregg Helt
  Sanger: Andreas Prlic
  Sweden: Andrew Dalke
  CSHL: Lincoln Stein
  UC Berkeley: Nomi Harris

Action items are flagged with '[A]'.

These notes are checked into the biodas.org CVS repository at das/das2/notes/2006. Instructions on how to access this repository are at http://biodas.org

DISCLAIMER:
The note taker aims for completeness and accuracy, but these goals are not always achievable, given the desire to get the notes out with a rapid turnaround. So don't consider these notes as complete minutes from the meeting, but rather abbreviated, summarized versions of what was discussed. There may be errors of commission and omission. Participants are welcome to post comments and/or corrections to these as they see fit.

Topic: Code Sprint - final preparations
---------------------------------------

* Daily teleconf for sprint 6-11 Feb (next week): 9am PST, 5pm UK
* Same teleconf line as used for our weekly calls

[A] Hinxton folks will set up IM

Likely Attendees on west coast:
  Berkeley: Gregg, Nomi, Ed E.
  UCLA: Allen, Brian

GH: Will focus on bringing client up to spec, work on code for affy server.
NH: DAS/2 client for apollo
NH: Can read das2xml now, can't do get requests. Hoping for server to test on
GH: how fast can we get a server that works sort of right
AD: can simply pass around xml files
NH: only have a few weeks for grant report. would like to interact with actual server rather than serving xml files
GH: review won't happen till march, so timing should work out
LS: will be standing by
GH: Best bet to get server up and running early: either allen modify biopackages, or gregg modify affy server. will coord with Allen, focus on getting affy server sooner than later.

[A] Gregg will coord w/ Allen to get DAS/2-compliant server in place for sprint.

Topic: Latest changes to the spec
----------------------------------

See Andrew's recent post to cvs and explanatory email to list.

AD: before the sprint, maybe clean up descriptions, "xxx clarify this", etc. otherwise the rough shape is good to go.

GH: Would like to walk through client-server interactions a das2 client would do, doing standard things for genome browsers. This will test: Does the spec do what it needs to do?

0. client discovers where it needs to go for servers (somehow, for now -- eventually using registry/discovery mechanism)

1. sequence request - gets all available genomes, and their versions (this is what the source request was). client doesn't worry about seq/source/version since it gets a sufficient skeleton from the seq request response.

2. segment request - given that user has chosen a genome and version, now want to figure out what annotations are available for it and present to user.

AD: you want features that are considered top level?
GH: no. want to get a list of chromosomes back.
GH: if there is a genome available and the coords are for contig, will the coords be based on the assembly or contig?
LS: how do I get from an entry pt to features? why do we need an assembly to get to features? do you have a length?
LS: we've made an assumption that all features are made at the genome level. This was a basic assumption we made two years ago.
GH: as long as there is an assembly available.
LS: when there's no assembly available (chromosomes) top-level could be 700 contigs, listed largest to smallest in the UI, annotations on each contig (like little chrms). Assembly could be a feat type that spans the whole chromosome.
GH: we talked about this 1 mo ago.
AD: for feats with multi coord systems.
LS: you shouldn't have to get assembly to get feat. only need entry pt segment and its length. The AGP assembly is for people interested in details of assembly, not nec the browser. One could turn an assembly into feats + sub feats using GO nomenclature.
GH: the only case where there is an assembly but server serves up contig coords is ensembl. How has this worked out?
LS: the requirement to map everything back to the assembly was too burdensome.
GH: will this be ok for ensembl to move back to assembly coords?
AD: the spec allows you to use relative coords.
LS: done on server side. no guarantee that any server will support this. all we can rely on is that we get a series of entry pts, you can use these entry pts to get features, in their coord system.
GH: ok.

(continuing with the walkthrough)

3. Assume in the das/2 client that the list is what is returned in the segments (request). User browses around, finds region of interest.

4. version/features request to find feats within region based on feat filter

5. before this is a types request to find available types. The client also determines any alt formats for these types and style sheet suggestions for how to render these feats.

GH,AD: server can define style for a particular feature. e.g., label one specific thing red and all other things of that type blue.
AP: don't mix feats, annots and way they are displayed. all display related info should be in style sheet.
GH: style sheet stuff entirely separate. separate request to get all styles. It says: for this type id render it this way.
GH: to make it easy on users and me, feat filts at first are based on the current view of a chromosome to define the overlap query. then there are a bunch of toggles for which types (for the type filter). In the client, there is a separate das query based on DAS type - i.e., one type per query for caching purposes (implementation detail). ontology knowledge: not in the client yet.
GH: use case: bunch of different gene predictions with same ontology type, but type id is genscan, exon array, genie, etc.
AD: other case: display alignment, different scores, want different colors based on the score.
GH: doable
GH: idea is that stylesheet is extensible this way.
    you can say things and as long as client understands, this is ok.
AD: yes. images, embedded svg, etc. but good to have a base of things all clients can use for extension, if you support it, do this, otherwise, do that.
GH: first pass client: pays attention to links, if user right clicks on feature, presents these links that user can navigate to via web browser. how do we do this?
AD: doc_href one per feature, xid zero or more - simple href + something else - no title description to say what this something else is.
GH: would be good to have something. can pop up all of these in window on right click, xid different from doc_href.
AD: notes for additional info. - not added in the spec. text only, embedded html? zero or one or more? if arbitrary xml, then no one knows how to represent.
LS: note is unformatted human readable text. UTF
AD: unicode?
GH: ok
LS: can use a link to point to more structured data.
SC: so what is the cardinality: doc_href=1, xid=0..*
AD: xid - would be good to have more metadata. just says it is used for gene das now.
GH: would like to have href and label
AP: thumb nails, mouse images, click on thumbnail to get full image
AD: key value prop list, key is string, value is string. Also show how to have value of string, href, arbitrary blob. Could put that in the property table. could also extend the feature
AP: href=xxx, image=... link is a url, i.e.
AD: is this for user-defined notes?
AP: there should be ...
LS: cross reference to another object. several ways to do it.
AD: is this something just for your client or something that others could take advantage of?
AP: multiple ways ok, one way: client agreed on
GH: afraid if we solve it for the general case we'll make it look like rdf. don't want every client to have to deal with this. let clients who do rdf discover the rdf in the embedded stuff.
AD: would like to get something out, usable now. Permit extensions.

AD: tree of /segment, /feature, /type query urls - feature url to query has plural: /segments?
    /features? /types?
GH: for feature query to get detailed info for features, are we going to use xml base, resolve URI to get the detail. Then we don't need the versions/feature/featureid request. reason: leaves loophole to use LSIDs to identify features.
AD: this is just a recommended layout. can implement your own system if you want to.
GH: would like to see this qualified as "suggested". As long as you can resolve a URI for the feature, it serves the same purpose.
GH: we have enough to go on now. can start implementing client tomorrow.
NH: hope so.

GH: other q's about spec?
AD: types issue. query for everything that matches this type, do you give the server the full http:// string to identify the type you want? Currently, we allow the use of a short name only for things that have their own name (for features) but didn't do it for types.
SC: what about using an ontology id for the type?
GH: need unique ids for types that are separate from ontology.
GH: the basic issue is, you don't want to have a long escaped string. We could have a name field in the type. not visible to human.
GH: cutting off everything but what's after the last slash in the URL.
AD: can add this.

[A] Andrew will add a short name for types

Wrap up:
Don't forget: dialin 9am pst / 5pm gmt for code sprint starting next Monday 6 Feb.
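
Illustration (not part of the meeting discussion): a minimal Python sketch of the type short-name idea raised in the spec topic above, i.e. keeping only what follows the last slash of a type URI so the filter does not carry a long escaped string. The example URI, the "type=" parameter name, and both function names are hypothetical and used only for illustration; they are not taken from the DAS/2 spec.

```python
from urllib.parse import quote, urlsplit

# Hypothetical type URI, for illustration only.
FULL_TYPE_URI = "http://example.org/das2/genome/volvox/1/type/genscan"


def short_type_name(type_uri: str) -> str:
    """Return whatever follows the last slash of the URI path."""
    path = urlsplit(type_uri).path.rstrip("/")
    return path.rsplit("/", 1)[-1]


def type_filter(value: str) -> str:
    """Build a 'type=' query fragment (assumed name), escaping every reserved character."""
    return "type=" + quote(value, safe="")


if __name__ == "__main__":
    # Filter using the full URI: long and heavily escaped.
    print(type_filter(FULL_TYPE_URI))
    # Filter using the short name: compact, no escaping needed here.
    print(type_filter(short_type_name(FULL_TYPE_URI)))
```

The short form is only unambiguous if the server guarantees that the last path segment is unique among its types, which is presumably why a dedicated name field in the type was also suggested.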