
The Latest Imperilment for Young Girls... LEGOs?

It was about 7 AM when I saw the tweet. I was closing down my Twitter and Facebook to return to the seething buildup that's going on -- and not looking forward to it this morning. My head ached from lack of sleep, I was overly cranky -- for several related reasons -- and honestly I was in one of those 'don't fu*&ing do it, man' moods. Then I saw the tweet. A flash, really, as I closed down the Twitter. I caught two words: 'Nightmare' and 'LEGO'. I also recognized the source, so I knew the context. With a new click I brought the Twitter back up to find out what the full message was.

To be honest, I had already decided that I didn't like it, and that it was in line with several of the messages and campaigns I've been coming across. Reading the text, I did exactly what I tell people never to do -- I didn't follow through or check the story. After the night I had, and the sources I had read for the twelve hours before that, I "knew" what was going on and responded in kind. Recognizing the sender account didn't help. It added an undertow of betrayal to the mix, and just for future reference, that mix is never a good one to be making decisions on -- although I can't picture anyone not doing so -- there's nothing more stimulating to a knee-jerk response than justifiable anger.

Two more relevant facts were in my mind as I reread the tweet. One is that LEGO was practically the founder of non-gender-specific marketing. They started this back in the late 70s. I recall it because my younger sister saw one of the ads and instantly wanted a full set of building blocks. At the time I wasn't that interested -- I had a new set of JARTS -- I still can't believe that my parents thought that what a seven-year-old needed was a set of iron javelins. It never crossed my mind at the time that building blocks were "boy toys," or rather NOT girl toys. What Sandra did with those blocks was amazing to me. I never could get them to become what was in my head, and she created whatever she wanted to create with an ease that was disturbing to me. I could throw those JARTS, though, so it was all good -- except for the neighbor's 1967 Corvette, it was all good...

Aftermath is on the way

I'm hoping that by the cover you can tell that Aftermath is a thriller. Aftermath is also the first novel I've been able to put out under my own name in a long time. For the last 5+ years I've been ghostwriting 7-10 novels a year. They were all short, and none of them were ideas of my own, but they were fun to do and a steady income.

Aftermath is a thriller, and will be hitting Amazon and places beyond very soon now. This is basically a reminder that I'm primarily a writer (reminding myself of that as well) and that I'm still committed to creating stories to engage your darker imagination.

It is Pi Day ... Be good to Yourself

Is there anything better than pie?

Getting the most out of a Tweet

If you are going to bother sending out links to bring people to your blog, you might as well spend a little more time and get some information from the effort. And if you have Google Analytics, then it really is an easy thing to set up. Check out Google's page on link tracking, bookmark it, and then come back for a few tips. SEO isn't that hard, and for the most part you don't need much to get some serious information.
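The usual way to track a shared link is to add Google Analytics campaign (utm_*) parameters to it before tweeting. A minimal sketch in Python -- the URL and parameter values here are made up for illustration:

```python
from urllib.parse import urlencode

def tag_link(base_url, source, medium, campaign):
    """Append Google Analytics UTM campaign parameters to a link."""
    params = urlencode({
        "utm_source": source,      # where the click comes from, e.g. "twitter"
        "utm_medium": medium,      # the channel type, e.g. "social"
        "utm_campaign": campaign,  # your own label for the push
    })
    return f"{base_url}?{params}"

link = tag_link("http://example.com/blog", "twitter", "social", "spring-posts")
print(link)
```

Clicks on the tagged link then show up in Analytics broken down by source, medium, and campaign.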

Opendata, Software, OpenEdu -- Life is good

Here's a list of some large datasets, and the kind of software that can handle these sets on a laptop.

I'll be back with more, there is some amazing stuff out here now.


1. Open Data

DBpedia - Querying Wikipedia Like a Database
DBpedia is a community effort to extract structured information from Wikipedia editions in over 90 languages and to make the resulting knowledge base available on the Web. The DBpedia knowledge base currently describes more than 3.5 million things, out of which 1.6 million are classified in a consistent ontology. It is one of the most comprehensive multi-lingual knowledge bases that currently exist and has developed into an interlinking hub for the Web of Data. The knowledge base is widely used by research projects as well as in industry. More information about the project is found on the DBpedia website.
Duration: Active since 2007
Project partners: Universität Leipzig, OpenLink Software, and a world-wide community of developers and mapping editors.
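As a sketch of what "querying Wikipedia like a database" looks like in practice, here is a Python snippet that builds a SPARQL request for the public DBpedia endpoint. The query and the dbo:/dbr: property names are illustrative (DBpedia's endpoint predefines common prefixes), and the actual HTTP request is left commented out so the sketch stays offline:

```python
from urllib.parse import urlencode

# A SPARQL query asking DBpedia for people born in Leipzig.
# Treat the ontology terms as illustrative, not authoritative.
query = """
SELECT ?person ?name WHERE {
  ?person a dbo:Person ;
          dbo:birthPlace dbr:Leipzig ;
          rdfs:label ?name .
  FILTER (lang(?name) = "en")
} LIMIT 10
"""

# The public endpoint accepts the query as a URL parameter:
endpoint = "http://dbpedia.org/sparql?" + urlencode(
    {"query": query, "format": "application/sparql-results+json"})

# Sending the request is omitted to keep the sketch self-contained:
# import urllib.request; print(urllib.request.urlopen(endpoint).read())
print(endpoint[:60])
```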
W3C Linking Open Data
W3C Linking Open Data community project supports and loosely coordinates the extension of the Web with a global data space by publishing open-license datasets as RDF and by setting data links between data items within different data sources. The project maintains the LOD dataset catalogue on CKAN as well as tool listings in the W3C ESW wiki. It regularly publishes statistics about the LOD data cloud and maintains the LOD cloud diagram. More information about the project is found on the LOD website.
Duration: Active since 2007
Project partners: Over 100 world-wide including the Massachusetts Institute of Technology (USA), DERI (Ireland), Talis (UK), University of Southampton (UK), Open University (UK), OpenLink Software (USA), BBC (UK), Geonames (USA). 
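The "data links between data items" the project coordinates are, at bottom, plain RDF triples using predicates like owl:sameAs. A minimal Python sketch of serializing one such link in N-Triples syntax (the Geonames ID below is illustrative, not looked up):

```python
def same_as_triple(subject_uri, object_uri):
    """Serialize one owl:sameAs link in N-Triples syntax."""
    predicate = "http://www.w3.org/2002/07/owl#sameAs"
    return f"<{subject_uri}> <{predicate}> <{object_uri}> ."

# Link a DBpedia resource to a matching Geonames record:
triple = same_as_triple(
    "http://dbpedia.org/resource/Berlin",
    "http://sws.geonames.org/2950159/")
print(triple)
```

Publishing files of such triples alongside a dataset is what knits it into the LOD cloud.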
Web Data Commons
More and more websites embed structured data describing, for instance, products, people, organizations, places, events, resumes, and cooking recipes into their HTML pages using encoding standards such as Microformats, Microdata and RDFa. The Web Data Commons project extracts all Microformat, Microdata and RDFa data from the Common Crawl web corpus, the largest and most up-to-date web corpus that is currently available to the public, and provides the extracted data for download in the form of RDF quads and also in the form of CSV tables for common entity types. More information about the project is found at
Duration: Active since March 2012
Project partner: Karlsruhe Institute of Technology (Germany)
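To make the Microdata encoding concrete, here is a deliberately minimal extractor using only Python's standard library. It is a sketch, not the project's framework: it ignores itemscope nesting and value-carrying attributes like content= or href=, which a real extractor must handle:

```python
from html.parser import HTMLParser

class MicrodataParser(HTMLParser):
    """Collect (itemprop, text) pairs from HTML Microdata markup."""
    def __init__(self):
        super().__init__()
        self.current_prop = None
        self.properties = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemprop" in attrs:
            # Remember which property the next text node belongs to.
            self.current_prop = attrs["itemprop"]

    def handle_data(self, data):
        if self.current_prop and data.strip():
            self.properties.append((self.current_prop, data.strip()))
            self.current_prop = None

html = ('<div itemscope itemtype="http://schema.org/Person">'
        '<span itemprop="name">Tim Berners-Lee</span></div>')
parser = MicrodataParser()
parser.feed(html)
print(parser.properties)
```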
Web Data Commons - Hyperlink Graph
The project provides a large hyperlink graph for public download and analyses the topology of the graph. The WDC Hyperlink Graph has been extracted from the Common Crawl 2012 web corpus and covers 3.5 billion web pages and 128 billion hyperlinks between these pages. To the best of our knowledge, this graph is the largest hyperlink graph that is available to the public outside companies such as Google, Yahoo, and Microsoft. The graph and the results of the analysis are found at
Duration: Active since November 2013
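For a sense of what one does with such an edge list, here is a tiny Python sketch of the most basic topology analysis, in- and out-degree counting. The host names are made up; the real graph's edges have the same (source, target) shape, just 128 billion of them:

```python
from collections import defaultdict

# A hyperlink graph as a list of (source, target) page pairs:
edges = [
    ("a.example", "b.example"),
    ("a.example", "c.example"),
    ("b.example", "c.example"),
    ("c.example", "a.example"),
]

out_degree = defaultdict(int)
in_degree = defaultdict(int)
for src, dst in edges:
    out_degree[src] += 1
    in_degree[dst] += 1

# The most linked-to page in this toy graph:
print(max(in_degree, key=in_degree.get))
```

At web scale the same counting is done in a streaming or distributed fashion, but the logic is unchanged.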
JudaicaLink
Scholarly reference works like encyclopedias, glossaries, or catalogs function as guides to a scholarly domain and as anchor points and manifestations of scholarly work. On the Web of Linked Data, they can take on a key function to interlink resources related to the described concepts. Within the context of JudaicaLink, we provide support to publish and interlink existing reference works of Jewish culture and history as Linked Data. More information about the project is found at
Duration: Active since May 2013
Project partner: European Association for Jewish Culture (France, UK)

2. Open Source Software

Silk - Link Discovery Framework
The Silk framework is a tool for discovering relationships between data items within different Linked Data sources. Data publishers can use Silk to set RDF links from their data sources to other data sources on the Web. Silk can also be used as an identity resolution component within Linked Data applications. Silk provides a declarative language for expressing identity resolution heuristics and implements a sophisticated blocking method (MultiBlock). There is a single machine and a Hadoop-based implementation available. More information about the project is found on the Silk website.
Duration: Active since 2009
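The core idea behind blocking (which MultiBlock refines considerably) is to avoid comparing every record with every other record: a cheap key puts candidates into blocks, and only records sharing a block are compared in detail. A toy Python sketch with made-up records:

```python
from itertools import combinations

# Toy records from two sources; the blocking key (first letter of
# the normalized name) limits which pairs get compared at all:
records = [
    {"id": 1, "name": "Berlin"},
    {"id": 2, "name": "berlin "},
    {"id": 3, "name": "Leipzig"},
]

def blocking_key(record):
    return record["name"].strip().lower()[:1]

blocks = {}
for record in records:
    blocks.setdefault(blocking_key(record), []).append(record)

# Detailed comparison only within each block:
matches = []
for block in blocks.values():
    for left, right in combinations(block, 2):
        if left["name"].strip().lower() == right["name"].strip().lower():
            matches.append((left["id"], right["id"]))

print(matches)
```

Silk's declarative language expresses the comparison step (string metrics, thresholds, aggregations) rather than hard-coding it as done here.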
RapidMiner Linked Open Data Extension
The RapidMiner Linked Open Data Extension is an extension to the open source data mining software RapidMiner. It allows using data from Linked Open Data both as an input for data mining as well as for enriching existing datasets with background knowledge. More information about the extension as well as its use cases are found on the project's website.
Duration: Active since 2013
D2RQ Platform - Accessing Relational Databases as Virtual RDF Graphs
The D2RQ Platform is a system for accessing relational databases as virtual, read-only RDF graphs. It offers RDF-based access to the content of relational databases without having to replicate it into an RDF store. Using D2RQ you can: 1. query a non-RDF database using SPARQL; 2. access the content of the database as Linked Data over the Web; 3. create custom dumps of the database in RDF formats for loading into an RDF store; 4. access information in a non-RDF database using the Apache Jena API. The D2RQ Platform has been downloaded over 15,000 times from Sourceforge. More information about the platform is found on the D2RQ website.
Duration: Active since 2004
Project partners: DERI (Ireland)
OEM distributor: TopBraid (USA)
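The underlying idea -- rows mapped to triples on the fly, with nothing replicated into an RDF store -- can be sketched in a few lines of Python against an in-memory SQLite table. The base URI and table are invented for the example; D2RQ itself drives this from a declarative mapping file:

```python
import sqlite3

# An in-memory table standing in for a legacy relational database:
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE person (id INTEGER, name TEXT)")
db.execute("INSERT INTO person VALUES (1, 'Ada'), (2, 'Grace')")

BASE = "http://example.org/person/"  # made-up namespace

def row_to_triples(row_id, name):
    """Map one relational row to N-Triples, virtually."""
    subject = f"<{BASE}{row_id}>"
    yield f'{subject} <http://xmlns.com/foaf/0.1/name> "{name}" .'

triples = []
for row_id, name in db.execute("SELECT id, name FROM person"):
    triples.extend(row_to_triples(row_id, name))

print(len(triples))
```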
LDIF - Linked Data Integration Framework
The LDIF - Linked Data Integration Framework is a Hadoop-based framework for integrating and cleansing large amounts of web and enterprise data. LDIF provides an expressive mapping language, an identity resolution component, as well as data quality assessment and data fusion modules. More information about the project is found on the LDIF website.
Duration: Active since June 2011
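Data fusion is the step where conflicting values gathered for one real-world entity are resolved into a single value. One simple policy -- keep the value most sources agree on -- can be sketched in Python (the property and values are invented; LDIF supports several such policies, not just majority vote):

```python
from collections import Counter

# Conflicting values for one entity, collected from several sources:
values = {"population": ["3400000", "3400000", "3500000"]}

# Majority-vote fusion: keep the most frequent value per property.
fused = {prop: Counter(vals).most_common(1)[0][0]
         for prop, vals in values.items()}
print(fused)
```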
ALCOMO - Applying Logical Constraints to Match Ontologies
ALCOMO is a project that has been developed by Christian Meilicke in the context of his PhD. It is a debugging system that transforms incoherent alignments into coherent alignments by removing some correspondences from the alignment. The removed part of the alignment is called a diagnosis. It is complete in the sense that it detects any kind of incoherence in SHIN(D) ontologies. At the same time, a computed diagnosis is always minimal in the sense that the tool never removes too much, i.e., the removed subset of the alignment is always a minimal hitting set over all conflicts. The system is available under the MIT license and can be downloaded here.
Duration: Available since 2012
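The hitting-set view of a diagnosis can be illustrated with a greedy sketch in Python: each conflict is a set of correspondence IDs that cannot all be kept, and the diagnosis must hit every conflict. Note the hedge: greedy selection illustrates the idea but does not guarantee the minimality that ALCOMO's reasoning-based approach provides, and the conflicts here are invented:

```python
# Each conflict is a set of correspondences that cannot coexist;
# removing at least one per conflict restores coherence.
conflicts = [{"c1", "c2"}, {"c2", "c3"}, {"c3", "c4"}]

diagnosis = set()
remaining = [set(c) for c in conflicts]
while remaining:
    # Greedily pick the correspondence in the most open conflicts
    # (sorted() makes tie-breaking deterministic):
    counts = {}
    for conflict in remaining:
        for corr in conflict:
            counts[corr] = counts.get(corr, 0) + 1
    victim = max(sorted(counts), key=counts.get)
    diagnosis.add(victim)
    remaining = [c for c in remaining if victim not in c]

print(sorted(diagnosis))
```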
Semtinel - Thesaurus analysis beyond numbers
Semtinel is a graphical thesaurus analysis and maintenance system developed mainly by Kai Eckert as part of his dissertation. It also formed the technical basis for several master theses and student research projects. The software is available at
Duration: Available since 2008
WDC - Extraction Framework
The Web Data Commons - Extraction Framework is used by the Web Data Commons project to extract Microdata, Microformats and RDFa data, web graphs, and HTML tables from the web crawls provided by the Common Crawl Foundation. The framework provides an easy-to-use basis for the distributed processing of large web crawls using Amazon EC2 cloud services. The framework is published under the terms of the Apache license and can be easily customized to perform different data extraction tasks. More information and the download instructions can be found on the Web Data Commons website.
Duration: Available since July 2014

3. Benchmarks

Berlin SPARQL Benchmark (BSBM)
The SPARQL Query Language for RDF and the SPARQL Protocol for RDF are implemented by a growing number of storage systems. As SPARQL is taken up by the community there is a growing need for benchmarks to compare the performance of storage systems that expose SPARQL endpoints via the SPARQL protocol. The Berlin SPARQL Benchmark (BSBM) defines a suite of benchmarks for comparing the performance of these systems across architectures. The benchmark is built around an e-commerce use case in which a set of products is offered by different vendors and consumers have posted reviews about products. The benchmark query mix illustrates the search and navigation pattern of a consumer looking for a product. More information about the benchmark is found on the BSBM website.
Duration: Active since 2008
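The shape of such a benchmark harness -- run a fixed query mix, time each query -- is easy to sketch. Here SQLite and a made-up product table stand in for a SPARQL store, since the harness logic is the same regardless of the query language:

```python
import sqlite3
import time

# A toy product table standing in for the BSBM e-commerce data:
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE product (id INTEGER, price REAL)")
db.executemany("INSERT INTO product VALUES (?, ?)",
               [(i, i * 1.5) for i in range(1000)])

# A fixed mix of queries, run and timed one by one:
query_mix = [
    "SELECT COUNT(*) FROM product",
    "SELECT AVG(price) FROM product",
    "SELECT * FROM product WHERE price > 100 LIMIT 10",
]

timings = {}
for query in query_mix:
    start = time.perf_counter()
    db.execute(query).fetchall()
    timings[query] = time.perf_counter() - start

for query, seconds in timings.items():
    print(f"{seconds:.6f}s  {query}")
```

A real run repeats the mix many times with warm-up rounds and parameterized queries, then reports aggregate metrics such as query mixes per hour.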
OAEI Anatomy and Library Track
The Ontology Alignment Evaluation Initiative (OAEI) is a coordinated international initiative to assess the strengths and weaknesses of alignment/matching systems and to compare the performance of techniques. In 2006 we offered the Anatomy track for the first time. This track consists of finding alignments between the Adult Mouse Anatomy and a part of the NCI Thesaurus (describing the human anatomy). The task is placed in a domain where we find large, carefully designed ontologies that are described in technical terms. Since 2012 we have offered a second track, called the Library track. The Library track is a real-world task to match the STW and the TheSoz thesaurus. Both provide a vocabulary for economics and social science subjects, respectively, and are used by libraries for indexing and retrieval. The latest versions of the datasets as well as the tools to process them are available via
Duration: Since 2006 as part of the annual OAEI campaign
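Matching systems in such tracks are scored by comparing their output against a reference alignment, usually with precision, recall, and F1. A self-contained Python sketch (the correspondence pairs below are invented, merely shaped like Anatomy-track entries):

```python
def evaluate(alignment, reference):
    """Precision/recall/F1 of a computed alignment against a reference,
    both given as sets of (source, target) correspondence pairs."""
    true_positives = len(alignment & reference)
    precision = true_positives / len(alignment)
    recall = true_positives / len(reference)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

reference = {("mouse:brain", "nci:Brain"), ("mouse:heart", "nci:Heart")}
computed = {("mouse:brain", "nci:Brain"), ("mouse:lung", "nci:Skin")}

print(evaluate(computed, reference))
```

Here one of the two computed correspondences is correct and one of the two reference correspondences is found, so precision and recall are both 0.5.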

Thoughts on Bullshit

Das ist nicht nur nicht richtig, es ist nicht einmal falsch!
(It is not only not right, it is not even wrong)

Some big minds tell us that it is impossible for someone to lie unless he thinks he knows the truth. This makes bullshit different from lying. Producing bullshit requires no such conviction. A person who lies is responding to the truth, and for his part, he believes he understands what the truth is.

When an honest man speaks, he says only what he believes to be true; the liar, correspondingly, believes his statements to be false. For the bullshitter, however, all these bets are off:

It is going to get Fakey this Year

This is a quote I found; I want to do some verification:  " On 7/31/2019 Trump has private meeting with Putin. On 8/3/2019, just three ...