Visualizing Coordinates
From genomewiki
Latest revision as of 20:28, 13 February 2014
Regarding strand coordinates, there are generally two ways in which this can be done:
#1. Specify the coordinates on the positive strand, and then after the fact,
note whether the feature is actually on the negative strand. We use
this one most often, probably because it makes it easier to
compare coordinates, especially if you don't care what strand it is on.
#2. Specify the strand first, and then use the coordinates of that strand.
Both are in use, in different places.
If #2 is used and the feature is on the negative strand, people use the phrase
that it is in "negative strand coordinates."
Cases that I can remember that do this are the chain files.
Also, bizarrely enough, the psl format mixes the two: the
main start and end coordinates are in positive strand coords
(probably to allow rapid coordinate compares while looking
for overlaps at the whole-gene level),
but for a negative strand alignment the actual block starts, and their order,
are in negative strand coordinates.
To convert from #1 to #2, you generally take
start2 = chromSize - end1
end2 = chromSize - start1
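The conversion above can be sketched in a few lines of Python. This is only an illustration: the function name `flip_strand` and the example numbers are made up, and it assumes the zero-based half-open convention described further down. Applying the conversion twice returns the original coordinates, so the same function converts in both directions.

```python
def flip_strand(start, end, chrom_size):
    """Convert a zero-based half-open interval to the other strand's coordinates.

    Hypothetical helper for illustration; not taken from any existing library.
    """
    if not 0 <= start <= end <= chrom_size:
        raise ValueError("need 0 <= start <= end <= chrom_size")
    return chrom_size - end, chrom_size - start


chrom_size = 44            # made-up chromosome size
start1, end1 = 8, 15       # positive strand coords (form #1)

start2, end2 = flip_strand(start1, end1, chrom_size)
print(start2, end2)                             # 29 36
print(flip_strand(start2, end2, chrom_size))    # (8, 15)
```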
To make my graph easier in text,
let's say that S and E are start1 and end1 in pos strand coords,
and s and e are start2 and end2 in neg strand coords.
<pre>
                  e      s                       ...210  (neg strand coords)
                   YYYYYYY
eziSmorhc=Cnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
           ppppppppppppppppppppppppppppppppppppppppppppC=chromSize
                   XXXXXXX
           012...  S      E                              (pos strand coords)
</pre>
With our zero-based half-open coordinates, the positive strand coordinate
runs from 0 to chromSize-1, that is [0,chromSize), which is also [0,chromSize-1].
Negative strand coordinates have the same range on the negative strand, of course.
So
s = C - E
e = C - S
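One way to convince yourself that these two formulas are right is to check them against actual sequence. The sketch below uses a made-up 20-base sequence and a hypothetical `reverse_complement` helper: slicing the negative strand at s,e should give the reverse complement of the positive strand slice at S,E.

```python
# Complement table for the four DNA bases (uppercase only, for simplicity).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


pos = "ACGTTGCAAGGCTTAACCGG"    # made-up positive strand sequence
C = len(pos)                    # chromSize
neg = reverse_complement(pos)   # the negative strand, read in its own direction

S, E = 2, 8                     # feature in pos strand coords, half-open
s, e = C - E, C - S             # the same feature in neg strand coords

# Both slices describe the same bases, seen from opposite strands.
print(neg[s:e] == reverse_complement(pos[S:E]))   # True
```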
With form #1, we say it is at S,E but by the way, it is really on the neg strand (-).
With form #2, we say it is on the negative strand (-), at coordinates s,e.
So, do you want the coordinates first, or the strand? Either way can work.
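When a file stores a list of blocks in negative strand coordinates, as described above for psl, going back to form #1 means flipping each block and also reversing the list, because ascending order on the negative strand is descending order on the positive strand. The following is a rough sketch under those assumptions only; `blocks_to_pos_strand` is a hypothetical helper, not a psl parser, and the numbers are invented.

```python
def blocks_to_pos_strand(block_starts, block_sizes, chrom_size):
    """Convert blocks from negative strand coords to positive strand coords.

    Assumes zero-based half-open coordinates and that the input blocks are
    in ascending order on the negative strand. Returns (start, end) pairs
    in ascending order on the positive strand.
    """
    converted = []
    for start, size in zip(block_starts, block_sizes):
        end = start + size
        converted.append((chrom_size - end, chrom_size - start))
    converted.reverse()   # ascending on neg strand is descending on pos strand
    return converted


# Two made-up blocks on the negative strand of a 100-base sequence.
print(blocks_to_pos_strand([10, 40], [20, 5], 100))   # [(55, 60), (70, 90)]
```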
---------
Note that if you use one-based closed coordinates then the picture
looks like this, with coordinate range [1,chromSize] on both strands:
<pre>
                   e     s                       ...321  (neg strand coords)
 eziSmorhc=C       YYYYYYY
           nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
           pppppppppppppppppppppppppppppppppppppppppppp
                   XXXXXXX                            C=chromSize
           123...  S     E                               (pos strand coords)
</pre>
s = C - E + 1
e = C - S + 1
So in these coordinates, there is usually some extra +1 or -1 that is needed
in coordinate calculations.
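A small sketch of the one-based closed version, again with hypothetical function names and made-up numbers. It also checks that the result agrees with the zero-based half-open formula once the interval is translated between the two conventions (start - 1, same end).

```python
def flip_strand_one_based(start, end, chrom_size):
    """Flip a one-based closed interval [start, end] to the other strand."""
    if not 1 <= start <= end <= chrom_size:
        raise ValueError("need 1 <= start <= end <= chrom_size")
    return chrom_size - end + 1, chrom_size - start + 1


def one_based_to_zero_based(start, end):
    """Convert one-based closed [start, end] to zero-based half-open [start-1, end)."""
    return start - 1, end


chrom_size = 44
S, E = 9, 15                                   # one-based closed, 7 bases
s, e = flip_strand_one_based(S, E, chrom_size)
print(s, e)                                    # 30 36

# Same feature in zero-based half-open coords, flipped without the +1.
zS, zE = one_based_to_zero_based(S, E)         # (8, 15)
zs, ze = chrom_size - zE, chrom_size - zS      # (29, 36)
print(one_based_to_zero_based(s, e) == (zs, ze))   # True
```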