
<  MyStuff  ~  Proposed update to MEI Format

Page 2 of 4
nwhitfield
Posted: Fri Jan 13, 2006 5:39 pm Reply with quote
Site Admin Joined: 20 Mar 2005 Posts: 9577
It did occur to me a while back that porting SQLite to the Toppy would be an interesting project, and open up some novel avenues for handling EPG data, especially with regard to TV Anytime.

Nigel.

_________________
Support this site - make a donation to our running costs
wooders
Posted: Fri Jan 13, 2006 6:18 pm Reply with quote
Frequent contributor Joined: 25 Aug 2005 Posts: 428 Location: The National Forest
Mike wrote:
Now the ideal would be a full ANSI compliant SQL implementation on the Toppy, with distributed database support via the USB to a Windows or Linux implementation, thus allowing full integration of any data on the Toppy with data sources of the user's choosing. Think of the possibilities!

Isn't the ideal a really good source of EIT data that has all the information that we all want in it? Then we don't actually have to DO anything.

_________________
TF5800, TS On, F/W: MS6 Recommended F/W 12/9/2009 -Sy+Pe
TAPs: EPG2MEI v0.96; Extend v1.7; Font Manager 1.0d; MyInfo B5.5; MyStuff 6.4; WSSkiller V2.12d; SecCache (UK) v0.4; EIT Sub (Game) v0.6; MHEG On/Off A3;
Sig generated by MyInfo on 30/3/12


Let rt2mei 1.1a feed your Toppy with EPG data from the Radio Times website.
Download rt2mei from http://www.wooders.co.uk/rt2mei
wooders
Posted: Fri Jan 13, 2006 6:26 pm Reply with quote
Frequent contributor Joined: 25 Aug 2005 Posts: 428 Location: The National Forest
My thoughts on file format, as promised earlier.

A few cursory thoughts to throw into the melting pot, which should not be too onerous for the 'consumers' (BobD and rwg) to accommodate:
  1. Lines beginning with // are comments
  2. Blank lines are ignored
  3. Lines of the form $<name> = <value> provide a way of defining certain global values. For example
    $version = 1.1 // and why not allow a comment here
    $generator = rt2mei
    $timestamp = 20060112
    Consumers not wishing to process these lines can treat them as comments.
  4. Format for actors could be
    <character>:<actor>;<character>:<actor>; ...
    Should director be a character or a separate field? I can think of arguments pro and con.
  5. More thought about including radio data required and where to get it. Or should radio be a separate file and format? Why not?
  6. Do we need any other fields?
Wooders
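
As a rough illustration of rules 1 to 3 above, a consumer could deal with the new line types along the lines of the Python sketch below. The function name read_lines is made up for this example, and stripping a trailing // comment from a $ line is only one reading of the proposal (it would also cut short any value that itself contains //).

```python
def read_lines(lines):
    """Separate $name = value lines from data records, skipping comments and blanks."""
    system = {}   # global values such as version, generator, timestamp
    records = []  # everything else is passed through untouched
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # rule 2: blank lines are ignored
        if line.startswith("//"):
            continue  # rule 1: comment line
        if line.startswith("$"):
            # rule 3: $<name> = <value>, optionally followed by a // comment
            body = line[1:].split("//", 1)[0]
            name, sep, value = body.partition("=")
            if sep:
                system[name.strip()] = value.strip()
            continue  # a malformed $ line is simply treated as a comment
        records.append(line)
    return system, records
```

A consumer that does not care about the $ values can ignore the first return value and still get the same records.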

charley
Posted: Sat Jan 14, 2006 2:02 pm Reply with quote
Frequent contributor Joined: 13 Jul 2005 Posts: 1563 Location: Belfast
If the file size grows, would it be worth agreeing a mechanism that lets a TAP check at startup whether the data file has already been loaded, and use that copy? Or would the problems be too great: what happens if the loading TAP exits, or if a TAP reloads the data to refresh it after a new file upload? Would a TAP need to check file integrity each time it accessed the data and, if it could not find it, load it itself?
I suppose one could have a "server" TAP to load the file and other TAP authors could choose to use or ignore it.

_________________
regards Charley
Toppy: TF5800PVR250GB &300GB;Firmware: ; Remote: Harmony 655; Tx:Divis; Autostart TAPs: MEISearch 1.35, MyStuff 6, AutoReboot V2.2, epg2mei, Power Manager v2.0, TAP Commander 1.2, TF5000 Display v1.50, MHEG_State; Other:
rwg
Posted: Sat Jan 14, 2006 2:18 pm Reply with quote
TAP author Joined: 29 Oct 2005 Posts: 604 Location: Oxfordshire
Only just found this thread - too busy putting out MeiSearch 0.2 yesterday evening to follow the board.

My 2p worth...

Go option 2 - the data overhead in memory is roughly the same for the 2 options, and it's always easier to merge extra fields into the description in the TAP code, if you don't feel like handling them properly, than to try to parse them out of a long string.

I *think* my MEI reading code is robust against extra/missing fields on each line - it would certainly be easy to fix if it isn't, and dealing with extensions like // comments and $ values is easy (to start with, just discard any lines starting with those). I'm also intending to put this part of the code into the public domain in the fairly near future - C++ classes covering EPG data, channels, events, timers, reading MEI data etc.

The actor specification with : and ; is good - it's human readable if the TAP doesn't bother parsing it, but can easily be parsed out for searching on actors. Put a space after each ; to improve readability.
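
For what it's worth, parsing that field out for searching could look something like the Python sketch below. The function name parse_credits is invented for the example, and treating an entry with no : as a bare name is an assumption, not something agreed in the thread.

```python
def parse_credits(field):
    """Turn '<character>:<actor>; <character>:<actor>; ...' into (character, actor) pairs."""
    credits = []
    for entry in field.split(";"):
        entry = entry.strip()  # tolerates the optional space after each ;
        if not entry:
            continue  # trailing ; or empty entry
        character, sep, actor = entry.partition(":")
        if sep:
            credits.append((character.strip(), actor.strip()))
        else:
            # Assumption: no ':' means the entry is just a name with no character
            credits.append(("", entry))
    return credits
```

A TAP that doesn't bother with any of this can still show the raw field, since it stays readable as plain text.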

If someone starts making EMEI files (extended MEI?) bung one my way and I'll see if it works.

Robin

_________________
Toppy: TF5800PVR; Firmware: 5.13.65 + patches + aXel; Remote: Pronto RU940; Autostart TAPs: MyStuff 6.5 and friends
LordCake
Posted: Sat Jan 14, 2006 4:57 pm Reply with quote
Frequent contributor Joined: 03 Jul 2005 Posts: 217 Location: Manchester
Yet another vote for option 2. Bite the bullet and get it right now.

As others have mentioned:
1) consider some "spare" undefined fields for future expansion
2) if anything needs adding to allow use for radio stations, add this now

and... keep it as simple and as general as possible.

Regarding name change:
Don't really care except that it might add some confusion if names of TAPs & file formats start changing.... so why not MEIv2 where you need to make it clear the new format file is being used (eg: in documentation).

_________________
Model: TF5800PVR F/ware: 5.13.65EfNfCyXpXwSXlUUuHPTCeGmSrUxEsRs Xmitter: Winter Hill Q: ~100% S: 76-95% Aerial: Group C/D bandpass filter Taps: MyStuff v4.54d, RemoteExtender v1.5, deselect v1.0Connected: Toppy<->undeclocked debianSLUG + iguanaIR running: ftpd-topfield, rt2mei, bleb2tie & lirc
EPG data for radio channels: http://my.opera.com/bleb2tie/
bdb
Posted: Sat Jan 14, 2006 7:33 pm Reply with quote
Frequent contributor Joined: 18 Oct 2005 Posts: 499
Bawbagg wrote:
Let the debate begin.


I don't think that the format is too bad as it is, but it does need defining more precisely and needs some changes/enhancements. Now seems to be a good time to do it, before even more reliance gets placed on it.

name?
topfield - there is nothing topfield specific about this format, but it is obviously aimed at this platform ...
event/epg - certainly applicable
external/import/export/info - but not only externally sourced; the format should be useful for importing to/exporting from the toppy internal database as well as a TAP's private database.

I like the play on eit - tei wins for me
or if the format remains very similar; then stick with mei

format?
- human readable (ascii)
- text editor friendly (no control characters)
- both generator and especially receiver friendly
-- (well defined rules for field separators, newlines etc)
-- makes parsing simpler and less error prone
-- rules may allow some tolerance to badly formatted files
-- e.g. dos/unix line ending, spurious <cr>/<lf>, spurious spaces, tabs, case sensitivity etc
- needs better field definitions

documentation: essential
does not need to be pretty or long winded; concise and precise is best, but must be freely available.
Is the source for rt2mei.php the current best effort?

missing fields from:
* credits
* language
* orig-language
* icon
* url
* country
* audio

+ event_id
+ service_id
+ transport_stream_id
+ original_network_id

LCN field: bad.
LCNs are user box specific, which makes current mei files non-portable. My box also has some duplicates - both CBBC and ITV4 are on LCN 30... Radio LCNs duplicate TV LCNS.

service_id: essential
these also allow for regional variations. Some people can receive from 2 different muxes. e.g. BBC1 has different service_ids for each region.

event_id: essential
the _only_ way to merge various sources. it is very unhelpful that the rt data does not carry them; the TVA data does ... The internal epg will replace an entry when the event_id matches; but will duplicate it if the event_id differs.
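
A minimal Python sketch of that replace-on-match behaviour, purely to illustrate the point: events are shown as dicts, the "title" key and the function name merge_events are made up for the example, and this is not how the internal epg is actually implemented.

```python
def merge_events(existing, incoming):
    """Merge two event lists, replacing an entry when its event_id matches."""
    merged = {event["event_id"]: event for event in existing}
    for event in incoming:
        # Same event_id: the newer entry replaces the old one.
        # Different event_id: it is added as a separate entry.
        merged[event["event_id"]] = event
    return list(merged.values())
```

Without an event_id in the source data there is no key to match on, so the same programme from two sources ends up listed twice.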

transport_stream_id: good for completeness, essential for satellites
original_network_id: good for completeness, essential for satellites

end_time / duration: one but not both
having both can only lead to trouble. If they match, then you clearly only need one. If they differ, then what do you do?
-my preference is for duration, and drop the end date/time.

comment field: good
but not //, if the main delimiter is the single character |, then any other delimiters should be single characters too
how about a leading # or ;

system field: good
e.g. $timestamp = 20060112
- limit to 1 per line; but watch the spaces around the =
easy to require a generator not to include them, harder to have a parser check for it

keeping fields separate: good
merging fields: bad
if a display tap wants to ignore a field - easy
if a display tap wants to classify based on a field - easy
if a display tap wants to merge everything in 1 box for displaying - fine
if it wants them separate, but is supplied them merged, it is a lot of work to 'unmerge' the fields.

Format for actors could be:
<character>:<actor>;<character>:<actor>; ...
- radio times uses
<character>*<actor>|<character>*<actor>
- ouch! here's a big trap to fall down ...

field length:
Since the data is mainly intended for use by the toppy, the length of the description fields causes problems - it makes dynamic memory management necessary to avoid pre-allocating huge buffers,
and fields may need truncating / splitting for display purposes. The internal epg cannot handle more than 256 characters, so a long film description may get rudely chopped.

TVA has a nice concept of potentially offering a short (90 char), medium (180 char) and/or long (1200 char) descriptions.

flags fields:
these are currently very vague. Are they boolean or multistate?
if boolean, why not true/false or 0/1
if multistate what are the states
e.g. does a blank widescreen field mean 4:3 or does it mean unspecified?
what about case sensitivity?
does Premiere == premiere == PREMIERE ?
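
One possible set of answers, sketched in Python so the questions become concrete. Everything here is an assumption for discussion: flags are boolean, written as 0/1 or true/false, compared without regard to case, and a blank field means "unspecified" rather than false. The function name parse_flag is made up.

```python
def parse_flag(value):
    """Return True, False, or None (unspecified) for a flag field."""
    text = value.strip().lower()  # assumption: case and stray spaces do not matter
    if text == "":
        return None  # assumption: blank means unspecified, not false
    if text in ("1", "true"):
        return True
    if text in ("0", "false"):
        return False
    # Anything else is outside the controlled list and is rejected
    raise ValueError("unknown flag value: %r" % value)
```

With rules like these a blank widescreen field would mean "unspecified", and 4:3 would have to be stated explicitly as 0 or false.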


Generally, I think for this type of format it is best to have very strict rules for the generator, and concentrate on keeping the parser very lightweight.
e.g.
- onus on generator to produce correct files
- no spurious spaces/tabs
- rigid end of line definition
- controlled list of valid flag fields
etc

genre:
TVA goes slightly mad when it comes to genre and offers multiple genres per program. Is there any control over the categories? - here there seems to be both 'Soap' and 'Soap Opera'...

'Neighbours' gets:
Soap Opera
Fictional portrayal of life
ENTERTAINMENT

'Doctors' gets:
Soap
Fictional portrayal of life
ENTERTAINMENT
Medical melodrama
Heart-warming
Gritty
General light drama
Soap Opera
REPRESENTATION/PLAY

'Murder, She Wrote' gets:
General light drama
Fictional portrayal of life
ENTERTAINMENT
Detective
Gutsy
Heart-warming
REPRESENTATION/PLAY
etc
... sort that lot out


extendability: essential
but tricky: how do you include support for something you've not yet thought of?

could just add more fields to each line

would be nice to reorder the existing fields, putting the essential (required fields) first, optional fields last.

could have some sub-category delimiters
e.g. for films there may be a set of categories that are not present for other programming.
for a film: ...|film[year|certificate|rating|director|cast]|...
for other: ...|soap|...

could have private sections etc
etc

Keep the debate rolling ...

bdb
DB1
Posted: Sat Jan 14, 2006 8:40 pm Reply with quote
Frequent contributor Joined: 30 Mar 2005 Posts: 728 Location: Orpington
Some comments in the original perl code for xmltv that it all started from indicate that the RadioTimes duration can be iffy.

_________________
TF5800, F/W: 5.13.65AbBfBqC0CbCeCkCwCyDEcEeEfErEsEvEzFFsGmHHeIKtNfOtPPcPePsRRaRhRpRsSSdSrStT2TdTfTpUUuXXpXwXl TAPs: TF5000 Display v1.53; QuickJump 1.72; Power Manager v2.2; MyInfo B5.6; DescriptionExtender 2.23; Remote Extender 1.6; Archivev1.0a; mei2archive BETA 3.8l7; EPGnavigator v5.1c; UK Auto Scheduler v0.73.1; Extend v1.7; Power Restore V0.7.8
Tx: CP
Sig generated by MyInfo on 14/4/13
rwg
Posted: Sat Jan 14, 2006 9:28 pm Reply with quote
TAP author Joined: 29 Oct 2005 Posts: 604 Location: Oxfordshire
Why do I feel this may be getting out of hand? I'd go with keeping the changes to a minimum by putting more fields on the end of each line. Re-ordering is going to cause confusion, as is removal.

It all gets complicated if there is any other use for the | character than top level field separators - the parser goes from a simple loop+switch statement into something more complex and error prone.
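
To show how little the simple case needs, here is a Python sketch of a reader where | is only ever the top level separator. The function name parse_record and the field names used in the usage note are hypothetical, since the real field order is not given in this thread.

```python
def parse_record(line, field_names):
    """Split one '|' separated line into a dict keyed by field name."""
    values = line.rstrip("\r\n").split("|")
    record = {}
    for index, name in enumerate(field_names):
        # Missing fields at the end of the line become empty strings;
        # extra fields beyond the known names are silently ignored.
        record[name] = values[index] if index < len(values) else ""
    return record
```

For example, parse_record("1|News|30|something new", ["lcn", "title", "duration"]) ignores the fourth value, so an old reader keeps working when fields are appended to the end of each line.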

Robin

nwhitfield
Posted: Sun Jan 15, 2006 12:17 am Reply with quote
Site Admin Joined: 20 Mar 2005 Posts: 9577
Indeed.

You have to keep it simple - and ideally backwards compatible too, since not everyone will remember to upgrade everything at the same time.

MIME does this nicely, where you look at a mail message in a non-compliant reader, and there's an explanatory text.

Perhaps a future revision of the format could use the first entry to indicate what version it is - and do so in such a way that it also creates a valid EPG entry that people may spot, say 10pm on Monday, BBC1 that says

NEWS: BBC News. Please upgrade your EPG software; for more details check www....

Adding extra fields just because you might need space later seems silly; keep what you have in the same order.

Perhaps if extra fields have to be added, they could appear with a different field separator, inside a current field, like the long description; human-readable, they'll still be accessible to people with old versions as well.

It would be great if you could overhaul things dramatically, to overcome shortcomings in version 1, but there are quite likely people out there who bought a Toppy, came along here saying "I don't like the EPG" and were told "Try MyStuff/MEI" and won't be paying much attention until the next time their box crashes and they need help. They're not going to read all the details, but they might grab an updated zip file if they see one, and you don't want them to get confused.

Nigel.

bdb
Posted: Sun Jan 15, 2006 1:04 am Reply with quote
Frequent contributor Joined: 18 Oct 2005 Posts: 499
and there's the rub

The only reason that consideration is being given to make any changes at all is that the current format is inadequate to handle some applications.

If you think you can propose a backwards compatible format, that does not add more complexity please go ahead. (I suspect not possible, since the current format is not actually defined).

I agree that it is pointless to try to support features that you have not yet thought of. Use xml for this.

I'm also against having multiple versions in existence at a time - incompatible versions are really completely different file formats. So unless agreement is reached, we should stick with a very simple
<field>|<field>|...|<field><CR>
where the field ordering and definition is fixed. However this will only work if there are adequate fields initially defined, and not continually added each month.

What the format needs most of all is a spec or list of rules, rather than relying on guesswork based on someone's implementation.
Are the end of lines 0x0d,0x0a, or 0x0a, or either?
Is it case sensitive?
What do all the fields mean?
etc.

If it is to be for a wider range of applications than it currently is, or if it is to support data sources from anywhere other than radio times, it does need a few extra features like service id, event id.

If the data is to be used for anything other than just displaying on the screen, there has to be definition of what values can be in each field.

If not; then leave it alone, and have every user invent their own format to allow their applications to work.

bdb
LordCake
Posted: Sun Jan 15, 2006 3:19 pm Reply with quote
Frequent contributor Joined: 03 Jul 2005 Posts: 217 Location: Manchester
nwhitfield wrote:
...Adding extra fields just because you might need space later seems silly; keep what you have in the same order. ...


Why is it silly? and why do you think it would change the order of fields?

If the existing format had had some spare undefined fields at the end of each record then the proposed format change would not be necessary (Old versions of applications would work as before, simply ignoring the data in these fields. New versions of applications would utilise the newly defined data). The file would be slightly larger by the amount of a few extra field separators.

Keep it simple, keep it as close to existing format as possible.

The existing format is defined here: http://www.toppy.org.uk/forum/viewtopic.php?t=2115&postdays=0&postorder=asc&start=214

Bawbagg
Posted: Sun Jan 15, 2006 6:35 pm Reply with quote
MyStuff Team Joined: 11 Aug 2005 Posts: 1122
I'm blown over by the interest this has generated.

There have been lots of great suggestions since I started the thread, and also the arrival of (yet) another TAP to use the MEI format (DX's mei2eit - what a fantastic idea!). All together, I think this cements the presence of the format.

Now, having read through the posts I'll try to summarise things as I see them:

1. Change/extension to the format is GOOD.
=> This WILL happen

2. Change to the name will therefore also happen
=> I *really* like Nigel's take on it, therefore new name will be Toppy Imported EPG, and files will have the .tie extension.
=> Default location will be /ProgramFiles/
=> Need a default name - I suggest MyEPG.tie

3. Upwards compatibility
=> Existing columns will REMAIN in existing order
=> Additional data fields will be tagged on the end

4. Comments and system fields will be added
=> Lines starting with "#" will be comments
=> Lines starting with "$" will be system fields with the format $field=value (NO SPACES)
=> System fields will be $timestamp and $generator (I don't think we should need or have version numbers)

5. Next field to be added is prog_credits
=> Optional delimiting within the field to be <role>:<person>;<role>:<person>
=> If no delimiting in the field, then a simple list of roles/people is acceptable
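
On the generator side, points 4 and 5 could come out something like the Python sketch below. The function names build_header and format_credits are invented for the example, and nothing here escapes a : ; or | that appears inside a name, because no escaping rule has been proposed.

```python
def build_header(generator, timestamp):
    """Return the comment and system-field lines for the top of a file."""
    return [
        "# Toppy Imported EPG data",  # lines starting with "#" are comments
        "$timestamp=%s" % timestamp,  # $field=value with no spaces
        "$generator=%s" % generator,
    ]


def format_credits(credits):
    """Join (role, person) pairs as <role>:<person>;<role>:<person>."""
    # Note: names containing ':', ';' or '|' would corrupt the field;
    # how to handle them is left open.
    return ";".join("%s:%s" % (role, person) for role, person in credits)
```

For instance, format_credits([("Director", "A. Person"), ("Presenter", "B. Person")]) gives "Director:A. Person;Presenter:B. Person", which goes into the prog_credits field as it stands.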

I'm yet to be convinced that we should add ServiceID to the file. If we do, where do we get the list of active ServiceIDs - they're not presented in the RadioTimes files (and will therefore need cross-matching). I think this will make file generation much more complex for the normal user - and could result in MANY questions from new users.

If someone can explain how ServiceID would work, then I'm open for discussion on swapping the field where we store LCN for ServiceID.

This is intended to be a stake in the ground. Have I missed anything?? If not, then I'll knock up a slightly more detailed document on the whole format, and maybe Nigel could post it in the Guides and Docs section of toppy.org.uk??

Cheers,

BB

_________________
TAPs: MyStuff Something or other + whatever CW recommends
MEI readme and latest version at http://my.opera.com/bawbagg
Current MyStuff Known Bugs http://www.BobDsMyStuff.co.uk/Bugs.shtml
nwhitfield
Posted: Sun Jan 15, 2006 7:02 pm Reply with quote
Site Admin Joined: 20 Mar 2005 Posts: 9577
Service IDs are useful for things like satellite users; you can see the service IDs using one of the menu options in EPG Uploader. Earlier versions of it used to rely on them for the configuration, which was a nightmare for new users, and one of the reasons I didn't play with it for quite some time.

Newer versions still support service IDs, but allow use of LCNs as well.

While LCNs are all that we can really choose channels by on the TF5800, don't forget that there are satellite versions too, and when the BBC start to promote their own free satellite service, we may find that we have more users who use that version appearing here.

The question then becomes whether or not you want this format to be restricted essentially to Freeview users, or if it should be capable of being used by people regardless of the actual way they're receiving the broadcasts.

As for documentation, I can host that on the site here, no problem. Just send me a PDF when you have one ready.

Nigel.

wooders
Posted: Sun Jan 15, 2006 11:19 pm Reply with quote
Frequent contributor Joined: 25 Aug 2005 Posts: 428 Location: The National Forest
Bawbagg wrote:
=> System fields will be $timestamp and $generator (I don't think we should need or have version numbers)

I believe that it would be better to include a version number now, even if it is never needed in the future. If we later need to modify the TIE format again, we will already have a version number in place so that the consumer TAPs can tell the difference between the old and the new format. I'm just trying to look ahead and cover the possibility of future changes.

Another field that springs to mind is:
$source - where the data was grabbed from, e.g. the base url of the RT, Bleb, Digiguide website

Admittedly this isn't vital data but it would probably be useful to include so that the relative quality of the various data sources can be compared.

BB, you haven't explicitly said anything about radio data in your last post, I think that it would be helpful if you would mention whether or not you see this being included in TIE.

