The Data Spec(ification)
Ryan Shriver and the other people named in the introductory comments below were the principal architects of the DeadLists Data Spec. This document lays out the form in which setlists will be recorded for the DeadLists Project. Comments, additions, questions, praise, and corrections should be posted to the DeadLists mailing list (with a cc to Ryan). Please note that this document is being put into action through the scripts and checks used by Ben Brewer's pages and through Tim Buller's database. Rather than relying solely on what you see here, visit Ben's pages first to see where this data spec comes into play. Where this spec and Ben's pages contradict one another, please consider Ben's pages to be more definitive. Why? Ben's codifications of these specifications conform with ongoing work and dovetail closely with the database efforts put together by Tim and Kevin Weil.
Version 1.3.2
Last update was on April 25, 1997
Introduction
This data specification provides the framework to which setlist data
for the DeadLists Project shall adhere. The specification
was created to aid both programmers and setlist contributors in their
efforts to create a searchable interface for Grateful Dead setlists.
Setlist contributors should read this document to understand the
format setlist data should take.
Contributors to this document
* Steve Zimmerman (saz@well.com) created the first version of this document.
* Nathan Wolfson (nathan@well.com) updated Steve's version 1.0 using
  input from the DeadLists Mailing List.
* Allen Baum (baum@apple.com) updated Nathan's version 1.1 in order
  to get his two cents in & make the spec a bit more rigorous.
* Ryan Shriver has tried to put a final polishing on this document
  based on more discussions on the mailing list.
Changes from version 1.3.1 to 1.3.2
- Dates must be an 8 character string of the form MM/DD/YY (e.g. 02/25/77).
- Changed syntax of Song Modifiers. Added ' Tuning'. Removed '-continued'.
- Removed the : song delimiter (because it's also used for timings).
  All others are ok.
- Modified timing maps so they must appear in the COMMENTS field and
  not in the setlist fields.
- Set timings appear at the beginning of a set (enclosed in brackets)
  and use the ; delimiter.
- Removed literal comments. There are only reference comments, which are
  numbered 1 - 99 and enclosed in parentheses in the setlist fields.
- The order of reference comments and song timings doesn't matter.
- Simplified the COMMENTS spec to allow the use of common sense :-)
- Simplified the CONTRIBUTORS field to adjust for the common working consensus.
- Updated example of the record of a show.
- Did not attempt to modify the BNF notation of data spec.
Changes from version 1.21 to 1.3.1:
- Cleaned up the appearance of the document while adding some general
  information.
- Revised the format for data.
- Listed the US and Canadian abbreviations.
- Record separator is now a blank line. See the example of a record below.
- Initial field names (BAND, DATE, etc.) are mandatory for every record.
- Tabs must separate field names from field data. Tabs are not to be used
  anywhere else in the record.
Changes from version 1.2 to version 1.21:
- An asterisk is used to separate sub-subfields, e.g. comment items in a
  comment (not new), and timing items in a timing-map.
- Many minor readjustments of the technical definition for clarity.
- Timings and timing maps are allowed inside the extended-silence
  separator ';;'
- Shows (records) are separated by , which is generally a character.
- New definition of to eliminate ambiguities that would have
  made parsing rather difficult: email address now in angle brackets.
Format
For the purpose of this document, each Grateful Dead concert from 1965 to
1995 shall be called a "show". For each show, there is certain historical
data that should be captured in an organized manner. A "record" is the
structure that will hold the historical data for each show. For every show,
there is one and only one record.
A record consists of 13 lines. The first line is a blank line,
identified solely by a carriage return (or NEWLINE character).
The next 12 lines consist of field names and their corresponding data.
The field names are (in order):
BAND
VENUE
CITY
STATE
DATE
SET1
SET2
SET3
ENCORE
COMMENTS
RECORDING
CONTRIBUTORS
Each of the above fields is described in its own section of this document.
Field names and field data will be separated by one TAB character.
Each field shall be separated by a NEWLINE (carriage return) character,
which is placed at the end of the field data. The format is:
FIELD_NAME<tab>Field Data<carriage_return>
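The field layout above can be sketched in Python. This is only an illustration under stated assumptions, not part of the spec: the helper name parse_record and the sample values are hypothetical, and the sketch assumes every field occupies exactly one line.

```python
# Illustrative sketch only: parse_record and the sample values are hypothetical.
FIELD_NAMES = [
    "BAND", "VENUE", "CITY", "STATE", "DATE", "SET1", "SET2", "SET3",
    "ENCORE", "COMMENTS", "RECORDING", "CONTRIBUTORS",
]


def parse_record(text):
    """Parse one record (a blank line plus 12 field lines) into a dict."""
    # Drop the leading blank separator line and any other empty lines.
    lines = [line for line in text.splitlines() if line.strip()]
    if len(lines) != len(FIELD_NAMES):
        raise ValueError("expected 12 field lines, got %d" % len(lines))
    record = {}
    for expected, line in zip(FIELD_NAMES, lines):
        # One TAB separates the field name from the field data.
        name, _, data = line.partition("\t")
        if name != expected:
            raise ValueError("expected field %s, got %r" % (expected, name))
        if "\t" in data:
            raise ValueError("TAB used inside the data of %s" % expected)
        record[name] = data
    return record


# Made-up sample data; fields with no data keep their name and TAB.
sample = "\n" + "\n".join([
    "BAND\tGrateful Dead",
    "VENUE\tOakland Coliseum",
    "CITY\tOakland",
    "STATE\tCA",
    "DATE\t10/17/91a",
    "SET1\tCumberland Blues > Promised Land",
    "SET2\tShakedown Street ; Samson And Delilah",
    "SET3\t",
    "ENCORE\tBox Of Rain",
    "COMMENTS\t",
    "RECORDING\t",
    "CONTRIBUTORS\tGordon Sharpless (paleo550@philly.infi.net)",
]) + "\n"

record = parse_record(sample)
print(record["VENUE"], record["DATE"])
```

Checking the field order and rejecting stray TABs mirrors the two rules stated above; a real loader might prefer to report problems rather than raise.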
Field Names
Field names are used to define what data should be included in each
field. Please pay close attention to the format of the field data.
BAND
Grateful Dead
VENUE
The venue where the concert took place. When concerts took place on college
campuses, it is important to list the entire name of the school. For example,
UCLA should be University of California, Los Angeles. When the venue is
unknown, a question mark should be entered in the venue field.
CITY
The city where the concert took place.
STATE
The two-letter abbreviation (listed below) of the state (or Canadian province)
where the concert took place. When the concert took place outside the
United States and Canada, use the full name of the country where it took place.
United States Abbreviations
AK Alaska
AL Alabama
AR Arkansas
AZ Arizona
CA California
CO Colorado
CT Connecticut
DC District of Columbia
DE Delaware
FL Florida
GA Georgia
HI Hawaii
IA Iowa
ID Idaho
IL Illinois
IN Indiana
KS Kansas
KY Kentucky
LA Louisiana
ME Maine
MA Massachusetts
MD Maryland
MI Michigan
MN Minnesota
MO Missouri
MS Mississippi
MT Montana
NC North Carolina
ND North Dakota
NE Nebraska
NH New Hampshire
NJ New Jersey
NM New Mexico
NV Nevada
NY New York
OH Ohio
OK Oklahoma
OR Oregon
PA Pennsylvania
PR Puerto Rico
RI Rhode Island
SC South Carolina
SD South Dakota
TN Tennessee
TX Texas
UT Utah
VA Virginia
VT Vermont
WA Washington
WI Wisconsin
WV West Virginia
WY Wyoming
Canadian Abbreviations
AB Alberta
BC British Columbia
LB Labrador
MB Manitoba
NB New Brunswick
NF Newfoundland
NS Nova Scotia
NT Northwest Territories
ON Ontario
PE Prince Edward Island
PQ Quebec
SK Saskatchewan
YT Yukon Territory
DATE
A date is an 8 character string of the form mm/dd/yy (e.g. "02/04/88").
In the event of uncertain dates, unknown elements will be replaced by question
marks. For example:
"some day in Feb. of 1984" shall be "02/??/84"
"some day in 1969" shall be "??/??/69"
"no clue whatsoever" shall be "??/??/??"
Following the date are flags to indicate other information about a
given show:
"a" = Early Show
"b" = Late Show
"@" = Opening Sets in which band member(s) participated
"+" = Sound Check
For example, an early show opener on February 4, 1992 shall be
designated as 02/04/92a@.
In the event that a precise date or show is stated, but its accuracy
is questioned, a trailing question mark will be added after the date and
flags (e.g. 04/01/68a?).
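The DATE rules above can be sketched as a small parser. This is a hedged illustration: the names DATE_RE and parse_date are hypothetical, the order of the flags (a/b first, then @ or +) is an assumption taken from the older formal description, and month and day ranges are not checked.

```python
import re

# Hypothetical pattern: MM/DD/YY with '??' for unknown parts, optional flags,
# and an optional trailing '?' for a questioned date.
DATE_RE = re.compile(
    r"^(?P<month>\d\d|\?\?)/(?P<day>\d\d|\?\?)/(?P<year>\d\d|\?\?)"
    r"(?P<flags>[ab]?[@+]?)(?P<doubt>\??)$"
)

FLAG_MEANINGS = {
    "a": "early show",
    "b": "late show",
    "@": "opening set with band member(s)",
    "+": "sound check",
}


def parse_date(field):
    """Split a DATE field into its parts; unknown parts come back as None."""
    m = DATE_RE.match(field)
    if m is None:
        raise ValueError("not a valid DATE field: %r" % field)

    def part(value):
        return None if value == "??" else int(value)

    return {
        "month": part(m.group("month")),
        "day": part(m.group("day")),
        "year": part(m.group("year")),
        "flags": [FLAG_MEANINGS[c] for c in m.group("flags")],
        "questioned": m.group("doubt") == "?",
    }


print(parse_date("02/04/92a@"))
print(parse_date("??/??/69"))
```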
Note: Using the supplemental flags above allows the use of only three set list
fields per show record and avoids a number of problems associated with
having upward of seven sets per show record. Otherwise, there would be
problems such as:
* Simple queries (e.g. finding all times a particular song was
  played in a particular set) will not work, because there will be no way to
  determine which of the "Sets" is the first GD set vs. an opening set.
* When data is imported into a database application for manipulation,
  the file size would be significantly larger than if the flagging method
  were used, because sufficient room in each of the seven (?) set list fields
  would have to be allocated to handle a full set list, when in reality
  relatively few of the fields above SET3 would ever be used.
SET1, SET2, SET3, ENCORE
Songs shall be named according to the mutually agreed upon list at
http://www.deadlists.com/ -> Resources -> Song Names.
New song names, jam names, tuning names, and so on, will be added as needed.
Song Modifiers
There will be standard song modifiers to note special occurrences
that we want to be able to filter or sort differently from other regular
song names.
These are:
"(song-name) Jam" specifies a recognizable insertion of a known song
that doesn't actually become that song.
"(song-name) Reprise" specifies the return to a song's conclusion,
whether or not begun during that show.
"(song-name) Tease" specifies the hint of a song that then becomes
something else.
"(song-name) Tuning" specifies that the initial notes of a song are
played as tuning warm-up.
The use of the song name 'Jam' refers to an unspecified instrumental
tune. If and when it gets classified, it will become "(song-name) Jam".
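A minimal sketch of how a program might separate a song name from its modifier follows. The helper split_modifier is hypothetical, and the suffix test is a simplification: a real check would consult the agreed song-name list, since a title could legitimately end in one of these words.

```python
# Hypothetical helper; the modifier words come from the list above.
MODIFIERS = ("Jam", "Reprise", "Tease", "Tuning")


def split_modifier(name):
    """Return (base song name, modifier), with modifier None if absent."""
    if name == "Jam":
        # A bare 'Jam' is an unspecified instrumental tune, not a modifier.
        return ("Jam", None)
    for modifier in MODIFIERS:
        suffix = " " + modifier
        if name.endswith(suffix):
            return (name[:-len(suffix)], modifier)
    return (name, None)


print(split_modifier("Playing In The Band Reprise"))
print(split_modifier("Jam"))
```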
Song Delimiters
Note: All song delimiters should be preceded and trailed by one blank space!!
';' Songs not continuing into each other are separated by a
semicolon (e.g. (song1) ; (song2)).
'>' Songs that segue into each other by means of a defined jam or
contiguous transition -- a guzinta -- are separated by the
character '>' (e.g. (song1) > (song2)).
'~' Songs that segue into each other, not by means of a defined
transition, but through an intentional pause are separated by a tilde
(e.g. (song1) ~ (song2)).
';;' In the event that two songs are separated by a pause of greater
than 60 seconds, a separator of two consecutive semicolons is used
(e.g. (song1) ;; (song2)).
It's possible to insert the length of the pause between the semicolons
(e.g. (song1) ; [3:20] ; (song2)) using the standard timing format.
'%' Where a tape splice, flip or pause prevents examination of the
space between songs, a percent sign will be used (e.g. (song1) % (song2)).
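The delimiter rules can be sketched as a splitter for one setlist field. This is an illustration only: SEP_RE and split_set are hypothetical names, the sketch assumes every delimiter has one blank space on each side as required above, and the items it returns still carry any timings or reference marks.

```python
import re

# Longer delimiters are listed first so ';;' and '; [m:ss] ;' win over ';'.
SEP_RE = re.compile(r" (;;|; \[[0-9:]+\] ;|;|>|~|%) ")


def split_set(field):
    """Return (items, separators) for one setlist field."""
    # re.split with a capturing group interleaves items and separators.
    parts = SEP_RE.split(field.strip())
    return parts[0::2], parts[1::2]


items, seps = split_set("Cumberland Blues > Promised Land ; Bertha ~ Sugar Magnolia")
print(items)
print(seps)
```

Keeping the separators alongside the items lets a later step tell a segue ('>') from a plain song break (';').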
Song and Set Timings
Song timings, where known, are enclosed in square brackets directly
following the item being timed (e.g. (song1) [2:33] ; (song2)).
Set timing, where known, comes at the beginning of the setlist field and
is separated from the first song by the ; character
(e.g. SET1 [58:38] ; (song1) ; (song2)).
Timing maps are collections of timings describing the substructure of a song.
They are placed in the COMMENTS field and enclosed in {} brackets. There
should be a reference mark next to the timing-map song in the setlist field
(e.g. COMMENTS (1) {intro: [1:20] verses1&2 [3:20] instrumental chorus
[1:55] verse 3 & fadeout [3:22] var of theme [2:43] jazzy jam [5:58]}).
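As a hedged sketch, timings and timing maps could be read like this. The names TIMING_RE, first_timing and timing_map_total are hypothetical, and the optional hours part ([h:mm:ss]) is an assumption borrowed from the older formal description rather than something this section states.

```python
import re

# [mm:ss], or [h:mm:ss] (the hours part is an assumption, see lead-in).
TIMING_RE = re.compile(r"\[(?:(\d{1,2}):)?(\d{1,2}):(\d{2})\]")


def to_seconds(match):
    """Convert one TIMING_RE match to a number of seconds."""
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds)


def first_timing(text):
    """Return the first timing found in text as seconds, or None."""
    m = TIMING_RE.search(text)
    return None if m is None else to_seconds(m)


def timing_map_total(comment):
    """Sum the timings inside the first {...} timing map, or None if absent."""
    m = re.search(r"\{([^}]*)\}", comment)
    if m is None:
        return None
    return sum(to_seconds(t) for t in TIMING_RE.finditer(m.group(1)))


print(first_timing("[58:38] ; Bertha [2:33]"))
print(timing_map_total("(3) {noodling [5:26] *teasing [9:32]* real jam [5:03] }"))
```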
COMMENTS
Comments are usually notes such as guest artists on songs, acoustic
versions, and any other information that may be useful but not
discernible from the setlist information alone.
Reference marks are numbered references to comments contained in the
COMMENTS field. The numbers (from 1 through 99) are placed in parentheses
following the item being commented on. For each numbered reference mark,
there must be an associated reference mark in the COMMENTS field, followed
by the corresponding comment. For example:
SET1 Little Red Rooster (1) [4:45] ; Bertha ;
COMMENTS (1) with Carlos Santana.
An item may carry more than one reference mark, and more than one item
may share the same reference mark. Thus, a song
might have two reference marks, one referring to guests, and one to
acoustic, or two songs might refer to the same guest artist comment.
If a timing is present, the order of the timing and the reference mark
does not matter. For example, all of the following are valid:
SET1 Little Red Rooster (1) [4:45] ; Bertha (1) ;
SET1 Little Red Rooster [4:45] (1) ;
SET1 Little Red Rooster [4:45] (1) (2) ;
SET1 Little Red Rooster (1) (2) [4:45] ;
SET1 Little Red Rooster (1) [4:45] (2) ;
COMMENTS (1) with Carlos Santana.
(2) Bob on acoustic.
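The requirement that every reference mark in a setlist field has a matching mark in the COMMENTS field can be checked mechanically. The sketch below is illustrative only: ref_marks and unmatched_marks are hypothetical names, and it assumes any parenthesized number from 1 through 99 is a reference mark.

```python
import re

REF_RE = re.compile(r"\((\d{1,2})\)")


def ref_marks(text):
    """Return the set of reference mark numbers (1 through 99) in text."""
    return {int(n) for n in REF_RE.findall(text) if 1 <= int(n) <= 99}


def unmatched_marks(set_fields, comments):
    """Return (marks used but not defined, marks defined but not used)."""
    used = set()
    for field in set_fields:
        used |= ref_marks(field)
    defined = ref_marks(comments)
    return sorted(used - defined), sorted(defined - used)


sets = ["Little Red Rooster (1) [4:45] (2) ; Bertha (1) ;"]
print(unmatched_marks(sets, "(1) with Carlos Santana. (2) Bob on acoustic."))
print(unmatched_marks(sets, "(1) with Carlos Santana."))
```

Because the marks are collected into sets, the order of timings and reference marks, and the sharing of one mark by several songs, make no difference to the check.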
All of the descriptions in the COMMENTS field except the first must be
labeled with a reference mark. The first description in the COMMENTS
field, if unmarked, refers to the entire show or set, depending on context.
If part of the show was acoustic, then an appropriate entry would be:
COMMENTS First set was acoustic, second
set was electric.
RECORDING
This category will be used to track the kinds of circulating recordings of
each performance using standard source abbreviations. An example of the
kind of information this might include (focused on two early years) can be
found at http://www.winternet.com/~edoherty.
For later years, specifications such as
the various kinds of audience tapes available for a particular show and
what kind of soundboard tapes exist will be included (though we might want
to include some default assumptions regarding the availability of at least
an analog audience tape from the taper's pit for every show
since the section was created in '84).
Proposed abbreviations:
* format:
o C=cassette
o R=open reel
o P=PCM
o D=DAT
o B=HiFiBeta
o V=HiFiVHS
Cassettes and open reel can list an equalization or compression used,
tape type, and speed, such as:
o A=Dolby A
o B=Dolby B
o C=Dolby C
o D=DBx
o MO=metal oxide
o CH=chromium
o 75=7 1/2 ips
o 35=3 3/4 ips
* source:
o SB=SBD patch
o MS=matrix SBD recording
o FM=FM broadcast
o PF=pre-broadcast FM recording
o A?=audience (unknown location)
o AF=audience, front of SBD
o TS=audience taper's section
* generation:
o M=master
o 1-9 = generation number (master=0 gen)
CONTRIBUTORS
The source of each set list will be noted with names and email addresses of
the people that submitted it. Some dates will have many contributors. These
will be deleted with time as the consensus emerges that "this list is correct".
Comments about how the contributor received the setlist should be noted in
the COMMENTS section. An example entry would be:
CONTRIBUTORS Gordon Sharpless (paleo550@philly.infi.net),
Jeff Tiedrich (jeff@tiedrich.com)
Example of a record of a show (Data is totally bogus):
BAND Grateful Dead
VENUE Oakland Coliseum
CITY Oakland
STATE CA
DATE 10/17/91a
SET1 [59:26] ; Touch Of Gray [5:04] ~ Little Red Rooster (1)
; Lazy River Road ; When I Paint My Masterpiece (1) ; Childhood's End [5:32]
; Cumberland Blues > Promised Land (2)
SET2 [45:23] ; Shakedown Street [10:34] ; Samson And Delilah
; So Many Roads ; Playing In The Band > Corinna > Drums > Space [30:01]
(3) > Playing In The Band Reprise % Sugar Magnolia [6:56]
SET3
ENCORE Black Muddy River ; Box Of Rain
COMMENTS Dave Matthew's Band opened.
(1) Bobby on acoustic. (2) with Carlos Santana. (3) {noodling [5:26] *teasing
[9:32]* real jam [5:03] }
RECORDING 180 SBD
CONTRIBUTORS Gordon Sharpless (paleo550@philly.infi.net),
Jeff Tiedrich (jeff@tiedrich.com)
<blank line>
DeadLists Formatting Specification Proposal
Formal Description
Version 1.21
Last update was 5/29/96.
Notation
An item inside angle brackets ( "<..>" ) is a type of an object.
    The type is either a literal string, or constructed of other objects.
Items within curly brackets ( "{..}" ) are items that must occur together.
    Normally this is to delimit a range, or set of repeated items.
Items inside square brackets ( "[..]" ) are optional.
Items separated by a dash ( "-" ) indicate a selection range.
    The first item indicates the lowest value of the range, and the second
    item indicates the highest.
Items separated by a vertical bar ( "|" ) indicate selection items;
    exactly one of the items is to be selected. If inside square brackets
    then none of the items is also valid.
Items followed by an asterisk ( "*" ) can be repeated zero or more times.
    To force something to be repeated one or more times, use <item> {<item>}*
Items inside single quotes ( "'..'" ) are literals. These can be used to
    insert characters that would otherwise be interpreted as formatting
    characters, such as brackets. To insert a single quote, use two
    consecutive single quotes.
Newlines and whitespace inside item definitions are for formatting only,
    and are NOT part of the field definition; use <crlf> if a newline is
    required, <ws> if spaces are required, and <ws?> if whitespace is
    permitted. The amount of whitespace is irrelevant.
Several objects are predefined literals:
    <crlf> is a newline (either a control-M or control-J
        or the character pair control-M,control-J).
    <newpage> is control-L.
    <tab> is a tab character, control-I.
    <space> is a space character.
    <ws> is <space> [ <space> ]*
    <ws?> is [ <space> ]*
---------------------------------------------------------
A <show> is <header> <band-field> <date-field> <venue-field> <city-field>
    <state-field> <set-field> <set-field> <set-field> <encore-field>
    <comments-field> <recording-field> <contrib-field>
A <header> is BAND <tab> DATE <tab> VENUE <tab> CITY <tab> STATE <tab>
    SET1 <tab> SET2 <tab> SET3 <tab> ENCORE <tab> COMMENTS <tab>
    RECORDING <tab> CONTRIBUTORS <crlf>
A <band-field> is (generally): "Grateful Dead"<crlf>
A <date-field> is <month>'/'<day>'/'<year>['?'][<show>]<crlf>
<month> is { 01-12 } | {1-12 } | '??'
<day> is { 01-31 } | {1-31 } | '??'
<year> is { [ 19-20 ] {00-99}}| '??'
<show> is [ 'a' | 'b' ] [ '@' | '+'] [ '?' ]
    Much of this is just a complicated way to say that numbers below 10
    can have an optional leading zero for day and month.
    Note that years without leading 19 or 20 assume 19.
    For <show>, 'a' indicates early show, 'b' indicates late show,
    '@' indicates opening act with band members, '+' indicates soundcheck.
    ****there is some ambiguity here, having to do with '?', especially
    if there are no show characters, but there is a show '?', etc.****
A <venue-field> is <name of venue> <crlf>
A <city-field> is <name of city> <crlf>
A <state-field> is { <name of state> | <name of country> } <crlf>
    Using the two-letter postal abbreviation for state, or two-letter
    internet domain name for non-US countries, is preferred.
A <set-field> is
    [ <comment-info> ]
    [ <song-name> [ <ws> <comment-info> ]
        [ <song-sep>
            [ <song-name> [ <ws> <comment-info> ] ] ]* ]
    The initial <comment-info> refers to the set; subsequent <comment-info>s
    have exactly the same format, and refer to the songs they immediately
    follow.
    An acoustic set, for example, might be prefixed with the comment
    keyword "Acoustic:". Alternately, it might have a comment-reference
    "(1)" which would refer to a comment "(1) Acoustic: ".
    Note that <set-field> is used for SET1, SET2, and ENCORE, and can be
    entirely blank.
A <comment-info> is [<timing>] [<timing-map>]
    [ <ref-mark> | { '(' <lit-ref> ')' } ]*
A <timing> is '[' [[<hr> ':' ]<minsec>] ':' <minsec> ']' <ws?>
A <hr> is { 0-24 | 00-24 }
A <minsec> is { 00-59 | 0-59 }
A <timing-map> is '{' <phrase-timing> [ '*' <phrase-timing> ]* '}'
A <phrase-timing> is <descrip><timing><ws?>
A <ref-mark> is '(' <1-99> ')' <ws?>
A <lit-ref> is [<keyword>':'] <descrip>
    [ '*' [<keyword>':'] <descrip> ]*
    Note that <descrip> is arbitrary text that doesn't start
    with a digit, or contain '(', ')', '{', '}', '[', or ']'.
    Valid <keyword>s include "Acoustic:", "Guest:", "Opener:".
    A complete list can be found in ***insert URL here***.
A <song-name> is <name of a song>['-'<song-mod>]
    The list of valid <song-name>s can be found in
    ***insert URL here***
A <song-mod> is '-jam' | '-continued' | '-reprise' | '-tease'
A <song-sep> is <ws> { ':' | ';'[[<timing>][<timing-map>]';']
    | '>' | '~' | '\' } <tab>
    Note that <song-name>s must not contain the characters ":;>~\".
    This definition implies:
    -Anything inside curly brackets is a timing map.
    -Anything inside square brackets is a timing (even inside curly brackets).
    -Anything inside parens is a literal comment or reference mark.
    Thus, we can find <song-name> by filtering out everything inside one of
    these kinds of brackets.
    The end of a <song-name> is a <song-sep> (or <crlf>).
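The filtering idea above can be sketched in a few lines. This is an illustration, not part of the formal description: BRACKETED_RE and strip_annotations are hypothetical names, and the sketch assumes the item has already been cut at its separators.

```python
import re

# Curly brackets are tried first, so timings nested inside a timing map
# are removed together with the map.
BRACKETED_RE = re.compile(r"\{[^}]*\}|\[[^\]]*\]|\([^)]*\)")


def strip_annotations(item):
    """Remove timing maps, timings and parenthesized marks from one item."""
    bare = BRACKETED_RE.sub(" ", item)
    # Collapse the whitespace left behind by the removed annotations.
    return " ".join(bare.split())


print(strip_annotations("Little Red Rooster (1) [4:45] (2)"))
print(strip_annotations("Space {noodling [5:26] * real jam [5:03]} (3)"))
```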
A <comments-field> is [ <comment-subfield> <tab> ]
    [ <ref-mark> <comment-subfield> ]
    [ <tab> <ref-mark> <comment-subfield> ]*
A <comment-subfield> is [<timing>] [<timing-map>] [<lit-ref>]
    The initial <comment-subfield> refers to the show.
    The rest must have <ref-mark>s that match <ref-mark>s in <set-field>s.
    The <comment-subfield> definition is very similar to <comment-info>,
    except: <lit-ref>s are not in parens and the <ref-mark>s are separated.
    The field definition is similar to a <set-field>, with <tab> instead of
    <song-sep> and <ref-mark> instead of <song-name>, so:
    -Anything inside curly brackets is a timing map.
    -Anything inside square brackets is a timing (even inside curly brackets).
    -Anything inside parens is a reference mark.
    Thus, we can find a comment by filtering out everything inside one of
    these kinds of brackets. Multiple items in a comment are separated by
    an asterisk, as usual.
A <recording-field> is [<fmt> [<eq>]] [<src>] [<gen>]
A <fmt> is {'C'<tape>} | {'R'<speed><tape>} | 'P' | 'D' | 'B' | 'V'
    (C=cassette, R=open reel, P=PCM, D=DAT, B=BetaHiFi, V=VHSHiFi)
A <tape> is 'MO' | 'CH' | ********list others here!*******
    (MO is metal oxide, CH is chromium, etc....)
A <speed> is [ '7.50' | '3.75' ]
A <eq> is 'A' | 'B' | 'C' | 'D'
    (A=Dolby A, B=Dolby B, C=Dolby C, D=Dbx)
A <src> is 'SB' | 'MS' | 'FM' | 'PF' | 'A?' | 'AF' | 'TS'
    (SB=SBD patch, MS=matrix SBD recording, FM=FM broadcast,
    PF=pre-broadcast FM recording, A?=audience unknown location,
    AF=Front of Sbd Audience, TS=taper's section)
A <gen> is 'M' | ('1'-'9')
A <contrib-field> is
    [[ '<' <email-addr> '>'] ['('<name>')']
    <verif-code><ws?><lit-ref><tab>]*
A <verif-code> is 'V:' | 'A:' | 'O:'
    (V=verified against tape, A=assumed, O=other, described in <descrip>)
    This definition implies:
    -Anything inside angle brackets is an email address.
    -Anything inside parentheses is a name.