...
What is the story with this page and others like it?
FIBO™ Infrastructure Support for Follow Your Nose Internationalized Resource Identifiers (IRIs) (/wiki/spaces/FPT/pages/8912930)
This document is superseded by https://github.com/edmcouncil/fibo/blob/issue/FLT-65/etc/process/iri-scheme.md
...
Status of Karthik's work with Dean
5) For next week.
Proceedings:
20170914 FIBO FPT
Linked Data Fragments - fragments.edmcouncil.org. OK: The objective was to see what I could do with the Linked Data Fragments server, using a protocol to turn on versioning, as an exercise to see what we can do. Found out two things: you can store versions in what is called a Memento, so each FIBO version could have a Memento associated with it. Would annotate an NQuads file or JSON-LD with such a Memento. Omar demonstrates the Memento feature, which is supported by the server (see Server.js in GitHub there). Tried this out and got a couple of versions for this test. Then, for a FIBO publishing job, we could have a job that goes out and saves off a version, so that one can query against these back in time using the Memento feature of the fragments server. This works (was able to enable it). Wrote a config file that needed some work; shows the config file. This is using NQuads, the ones from the current spec site. Hard-coded a 'valid for' window. Has FIBO and FIBO2 (as proof of concept). Identified some config gotchas.
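The per-version datasources and the hard-coded 'valid for' window described above might look roughly like this in the fragments server's config file. This is a sketch: the datasource names, file paths, dates, and the exact Memento/timegate keys are assumptions based on the LDF server's documented config style, not the actual file shown in the meeting.

```json
{
  "title": "FIBO Linked Data Fragments server",
  "datasources": {
    "fibo": {
      "title": "FIBO (version 1)",
      "type": "NQuadsDatasource",
      "settings": { "file": "data/fibo-v1.nq" },
      "memento": { "interval": ["2017-01-01T00:00:00Z", "2017-06-30T00:00:00Z"] }
    },
    "fibo2": {
      "title": "FIBO (version 2)",
      "type": "NQuadsDatasource",
      "settings": { "file": "data/fibo-v2.nq" },
      "memento": { "interval": ["2017-07-01T00:00:00Z", "2017-09-14T00:00:00Z"] }
    }
  },
  "timegates": {
    "mementos": {
      "fibo-latest": { "versions": ["fibo2", "fibo"] }
    }
  }
}
```

A publishing job could regenerate this file per branch/tag, which is the scripted config generation JG mentions below.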
Questions? PR: how does this relate to the StarDog versioning? OK: Abandoned the StarDog versioning, though there is still a connection to StarDog; per version you could probably execute a query on StarDog. But this was abandoned in June in favor of this. PR: pros and cons? OK: vendor lock-in; the StarDog versioning is specific to StarDog, while this is a more standards-based protocol. Sponsored by BNY Mellon (Mellon Foundation); a collaboration between Los Alamos, the Library of Congress and the Mellon Foundation. This is an RFC. JG: by not using StarDog we are more closely tied to the structure on spec.edmcouncil.org, where we have versions for every branch and tag, so we can create a fragment server for each of these versions via the config file. Can generate the config file using a script if we do that.
OK's to-do list: the interval. There is a simple JSON parser (command line) that you can use to parse the JSON files; it writes timestamps in ISO 8601 format. OK would process the file, see if the interval from a previous entry already exists, and if so concatenate it.
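The interval-concatenation step OK describes could be sketched as follows. The helper names are hypothetical; the real command-line parser was not shown.

```python
from datetime import datetime, timezone

def parse_iso(ts: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp like '2017-09-14T00:00:00Z'."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)

def merge_intervals(intervals):
    """Merge a list of (start, end) validity intervals, concatenating any
    interval that overlaps or abuts a previously seen one."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

ISO 8601 timestamps in this form sort correctly as plain strings, so the same merge works whether the intervals hold parsed datetimes or raw timestamp strings.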
What does the interval look like? OK: It says for how long you intend a given version to be valid. What does the tool do with that? If you click on a Memento, it says that a given Memento is valid from here to here. Does it disable anything? No. JG: All our versions of FIBO are permanently valid; as long as a user is using a given version, it is valid. OK: not sure what we would do with that feature. Also has questions for the authors of the tool; need to know why we would need to assign a validity date.
PR: What is the scope of a fragment, e.g. all of FIBO or some part? How is it normally used? OK: A fragment is a different format to, e.g., what a SPARQL query endpoint returns; this is very different. So how many fragments would we have, e.g. all of FIBO, a Domain, a Module? OK: Not sure; this is something to decide about. What does a URL look like if you go from FIBO 1 to FIBO 2? Can you generate links towards that on the FIBO website? OK: We probably could. So if you click on FIBO 1 and find a fragment, what does it look like? OK shows a generated URL from the tool. JG: We generate output for all the products. If we can generate the URL from this tool, we can add that URL to the corresponding fragment for the right version of FIBO. OK will write up findings and try to capture that. Also needs to talk to the authors, who may be willing to accept some changes if needed.
PR: When would one use an LD Fragment in place of FYN in the existing material? OK: This is just another way to use the content. PR: The files are RDF with additional metadata relating to that fragment? OK: Yes. JG: You can look for a term in the Glossary, find it, look for it in the fragment server, and then navigate around to discover aspects. PR: So why would you not use FYN for that? JG: In the future website we need to put it together as one user experience with multiple options. Until that is the case, these LD Fragments will provide an alternative way of looking at the content. JG: Also put out these different formats and see what people find useful.
PR: Still to know: if people click on LDF vs FYN, what do they see that is different? OK demonstrates what they would see. Looks like screenshot text; must mean something to someone. PR notes it has invented some blank node IDs. Was this us, or the creation of the Quads? OK: Not sure; this pulls in what was created in the published material. JG: There is hover text that shows the right URL. What happens when you click on a class? Try Relative Price. DA has a question re Memento vs StarDog. In StarDog there are services doing diffs, looking at provenance and so on, e.g. when was the last time a triple with the subject LegalEntity changed? Is this a use case for Memento? See e.g. the issue with Contracts.rdf yesterday, where the maturity level was removed; we needed to know when that changed, and did so via GitHub and the source files. Does Memento make this easier? OK: Memento does not have that version hygiene. DA: StarDog does support the above use case; it tells you in what version a given thing changed in a given way, and you can query for that. JG: Memento is more for showing different data sets next to each other. DA: Can do this in StarDog too: create a diff, cache it and query it. PR: A diff only works if the serialization is consistent. DA: In the database you can do a diff on the triples themselves, not just the order of them.
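DA's point that a store can diff the triples themselves, independent of serialization order, reduces to a set difference. A toy sketch (the triples below are illustrative stand-ins, not real FIBO data):

```python
def diff_triples(old, new):
    """Compare two graphs as sets of (subject, predicate, object) triples,
    so the result is independent of serialization order."""
    old, new = set(old), set(new)
    return {"added": new - old, "removed": old - new}
```

Applied to something like the Contracts.rdf incident, the dropped maturity-level triple would appear under "removed" for the version in which it disappeared.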
PR: I see references to Virtuoso; is this based on that? OK: No, this just has a link to Virtuoso, as it has links to DBpedia; there is some relationship between DBpedia and Virtuoso. Did not say what; probably well known to some. JG: Show in which versions of FIBO a given class appears. DA: The class is easy; we also need restrictions and other patterns. JG: You can do that in Memento versions; you can find a given pattern. See e.g. the 'Find matching triples' feature. Click on that and get a result you can scroll through. JG: See where the pattern we were looking for exists in FIBO1 but not in FIBO2, as expected.
Chrome has a glitch for some functions in this. Was trying a job to reset the instance, which seemed not to be working, but it turned out to be Chrome caching behavior. If I try to go to an individual FIBO dataset, it does not come back with anything; seems to be a bug, possibly a config error. OK needs to find out from the authors what to do about this.
JG: Can pick a query with a drop-down box. If we can translate all our user stories into queries, with test data sets, then people can do queries. This may make an easier way for people to access FIBO. JG: First focus on publishing all the FIBO versions so they all show up as data sources. Then later we can publish standard test data next to that, plus the user-story SPARQL queries as above. Then we can show users what they can do with FIBO.
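A user story such as "show me all subtypes of a given class, with their labels" might translate into a query along these lines. The class IRI is purely illustrative, not the real FIBO IRI, and the actual user-story queries were not shown.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Hypothetical user-story query: subclasses of a given class, with labels.
SELECT ?entity ?label WHERE {
  ?entity rdfs:subClassOf <https://example.org/fibo/LegalEntity> ;
          rdfs:label ?label .
}
```

An LDF client evaluates such a query on the client side against the server's triple pattern fragments, which is why the same query could be pointed at any of the per-version data sources.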
PR: How often do we publish, e.g. once per pull request? JG: The Latest version will always overwrite the previous version, so it's seen as one version. This only changes when you add a tag, such that that tag represents a different version.
DW to DA: What do we do to get this on spec? DA: Do we add this as a Product and put it on the front of the spec page (that table of Products)? It runs on a different server, but the link can be in that table. DA: We would treat this similarly to the Glossary, as a derivative product. Figure out where in the table that goes, what to call it, a one-line gloss of what it is, and further information addressing the questions of what it is for and who the audience is.
OK will write that page. OK will write this up for release notes, and get more info from the authors.
Next item: DA re the recent process breach. DA is happy with the email responses on this; not much discussion needed except timescales and effort. There is now an update to Jenkins that helps do things in a more orderly fashion. DA: We need the team to understand when and where they need to carry out their own hygiene tests. This is a refinement for the human workflow. DA: Make this a priority for Q4 (not for Sept 30).
The hygiene tests themselves need to be set. JG: Why call these hygiene tests and not just tests? DA: Hygiene is one kind of test; there are others. PR: We must distinguish levels of tests, whereby hygiene tests are a minimum, while there are other tests things need to pass in order to qualify for Release. DA: This is what TC proposed using SHACL (previously used SPIN). We can talk about the tests in SPARQL. Also need data. DA: We should make a priority of that. Maybe use or adapt data from the existing PoCs. Identify: levels of test, human workflow, test data, SPARQL.
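To make the idea concrete, a hygiene test could be expressed as a simple check over the triples, here in plain Python as a stand-in for the SHACL/SPARQL form the team discussed. The check itself (every declared class carries a label) is an assumed example, not one of the actual FIBO tests.

```python
def classes_missing_labels(triples):
    """Hygiene check sketch: every declared owl:Class should carry an
    rdfs:label. Returns the set of class identifiers that fail the check.
    Prefixed strings stand in for full IRIs."""
    classes = {s for s, p, o in triples if p == "rdf:type" and o == "owl:Class"}
    labeled = {s for s, p, o in triples if p == "rdfs:label"}
    return classes - labeled
```

The same predicate-level check is straightforward to restate as a SHACL shape or a SPARQL ASK/SELECT, which is what would run in the Jenkins workflow.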
PR: Do we have many use cases we can convert to SPARQL? JG: Call these User Stories, not Use Cases; there is a standard for how these are set out. Then translate each User Story to SPARQL. PR: We need both: a User Story says what it should achieve (broad), versus a detailed policy for Use Cases. DW: The BTDM has both.
Next: DW is re-writing the BTDM. The colors are gone; maturity levels are in there. Also plans to make that document smaller, with pointers to separate docs maintained on a regular basis. Some of the things he wants to link to are now dead links. Also ends up with a circular trail of links via JIRA and the Wiki. Is there a way to prevent this happening automatically?
JG: Latest: https://github.com/edmcouncil/fibo-infra/tree/master/doc/publishing
DW: A policy for maintaining this stuff has to be written. The fact that JG posted the link above in the chat is not the same as having it turn up automatically where we need it. DW: Also, we need not to have links via GitHub, since now not all users will be using that or have access. DW: Wiki page - Featured Pages - Recently Updated (not very recent content!). Need to clean up our act.
JG: Confluence is slack about versioning; it's used more for interaction. Need things that are proper documents. JG: The markdown documents in Git would need to be picked up and translated via HTML into readable docs on our spec site.
DW: For the Developers Guide, comments are below; fine during creation, but once created that's not OK. People need a place to go and look at a thing and trust it, without being within the team. The same applies for tests, including hygiene. Today there is only one policy there, from this week, plus a lot of discussion of potential policies.
Primer: with DW. BTDM: with DW. CCM Guide: with MB. Dev Guide: with DA. Need to identify the relationships between these so we can assemble them rather than cut and paste.
PR: Also think of the audiences for each; else there is (as at present) duplication between them. Also, there are things not written. MB: Also, the FCT guidelines just started. We need versionless links to the latest of each thing, that don't change. Earlier drafts are needed for history, but mostly the public doesn't need to see that history. DW: The public stuff needs to be on the spec server; that way it comes under the control of a publishing process just like ontology changes. This would also apply to front matter for OMG submissions.
CCM Guide status: MB working on this, will prioritize over getting CCM round trip actually working.
Next: welcoming Karthik. DA: There were loads of things that he was struggling with, e.g. running what on what machine; this was a less effective use of DA's time. Has handed a bunch of these tasks to Karthik (KG). Any discussion of those tasks? KG: Looked at a couple of these; they look straightforward. DA: These may be related, e.g. where a thing did not run to completion or failed to mount a disk, so they may turn out to be the same task. Need this by Sept 30 (actually the Sept 25 drop-dead date). Tasks are listed in email from DA to KG. These are about why this or that process failed, why a file didn't get moved and so on, which are probably related. These are a blocker for the Glossary. See also JG's comments this morning: the Glossary should be a versioned artifact like anything else. Up to now it was hand-built, so it was not possible to do an auto Glossary for each version. Once this is fully automatic we can do that.
PR: are we on track to freeze things on 25th? DA think so.
Round-tripping: JL has asked Nilesh to work with MB. MB: It is not realistic to dump all of what comes out of CCM onto what is in GitHub. CCM is a target of ontologies from the OWL, not the other way around.
JL: There is another case, which is that when you are authoring in CCM you output that ontology in OWL along with any other affected ones.
DECISION: we only use CCM as a stand alone edit tool - only emit the affected ontologies. Other owl tools do the same.
Decisions:
we only use CCM as a stand alone edit tool - only emit the affected ontologies. Other owl tools do the same.
Action items
Omar Khan: Do a write-up for spec on the Linked Data Fragments release.