August 10th, 2006, 16:51 PM
Old and Cranky
Software Mines Internet To Identify Music Piracy
By Laurie Sullivan, TechWeb Technology News
Identity Systems will soon roll out software that lets media companies from music to movies search through unstructured text on the Internet to identify piracy.
The software, called "Unstructured Data Module," applies analytics and algorithms to scan for hidden relationships in streams of digital data. Beyond information found in traditional databases and spreadsheets, the software digs into e-mails, file directory listings, search results for peer-to-peer (P2P) sites, and lists of top downloaded songs on Web sites, a company executive said Wednesday.
Today, the music industry and movie studios work mainly from neatly organized structured data files. But as the move to digital music accelerates and opens new channels, the industry must work in nontraditional formats to find the slightest variation in song titles and artist names as part of copyright compliance.
"The music industry has a pervasive challenge because song titles and artists' names can be among the hardest to match when the data set is large," said Ramesh Menon, Identity Systems' North American operations director. "We're finding more and more that music societies and publishers are seeking this type of solution as a critical part of copyright compliance."
EMI Music Publishing's copyright system, written in the COBOL programming language, runs on an AS400. The Identity Systems software runs separately on a standalone server.
The two platforms are used to manage royalties and monitor piracy by matching a list of songs received from music associations and music download sites, explains Alec Malyon, IT Director of Royalty & Copyright Systems at EMI. "I extract the master file from the AS400, add an algorithm for the titles, and use the Identity Systems software to match writers against the file we receive," he said.
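To illustrate the kind of title matching Malyon describes, here is a minimal Python sketch. It is not EMI's or Identity Systems' actual algorithm, which the article does not detail; the function names `title_key` and `match_titles`, the normalization rules, and the sample titles are all assumptions made for illustration.

```python
import re
import unicodedata

def title_key(title: str) -> str:
    """Reduce a song title to a comparison key (hypothetical normalization rules)."""
    # Decompose accented characters, then drop the combining marks.
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Lowercase and replace anything that is not a letter, digit, or space.
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    # Drop leading-style articles and collapse whitespace.
    words = [w for w in text.split() if w not in {"the", "a", "an"}]
    return " ".join(words)

def match_titles(master, incoming):
    """Return (incoming_title, master_title) pairs whose keys are identical."""
    index = {title_key(t): t for t in master}
    return [(t, index[title_key(t)]) for t in incoming if title_key(t) in index]

# Made-up sample data, standing in for the master file and a received file.
master = ["The Long Road Home", "Café Nights"]
incoming = ["long road home", "Cafe Nights!", "Unknown Song"]
matches = match_titles(master, incoming)
```

Building the key index once keeps each lookup cheap, which matters when the received file holds millions of titles.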
Music download sites, such as Loudeye, which Nokia acquired this week for $60 million, send EMI a file of roughly 5 million song titles to match against the music label's master file, for example. The record label has been working with Identity Systems to build a platform. Processing the data once took up to six days; it now takes one.
EMI also has copyright watchdogs monitoring Internet sites to verify licenses and rights, taking action when necessary, such as last week's lawsuit brought against file-sharing site LimeWire LLC by some of the world's biggest record labels, including EMI Group Plc.
The complaint filed in Manhattan federal court claimed LimeWire's software allows users to download music without paying for it. It is the latest in a string of lawsuits the music industry has filed in an attempt to slow Internet piracy since the U.S. Supreme Court ruled last year that content companies can take legal action against technology firms that encourage copyright infringement.
"In the case of an infringement, if there's a settlement or ruling, the company sends us a file," Malyon said. "Say the judgment is for $2 million. We match the list with our database to determine which artists and songs we represent, and that's how much they need to pay us."
"Typically companies are growing their own, but it's an opportunity for the software vendors to help companies mine and secure the music so it can't be passed around," said Susan Feldman, research vice president of content technologies at IDC. "The Identity Systems software enables companies to identify variations in the data by doing a fuzzy match," she said.
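For readers unfamiliar with the term, a fuzzy match scores how similar two strings are instead of requiring them to be identical. The sketch below uses Python's standard `difflib` module as a stand-in; the article does not say which technique Identity Systems uses, and the `similarity` and `best_match` helpers, the 0.8 threshold, and the sample names are assumptions for illustration only.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Return a case-insensitive similarity score between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def best_match(name, candidates, threshold=0.8):
    """Return the most similar candidate, or None if none reaches the threshold."""
    if not candidates:
        return None
    # Score every candidate and keep the highest-scoring one.
    score, candidate = max((similarity(name, c), c) for c in candidates)
    return candidate if score >= threshold else None

# Made-up artist names: a misspelled query against a small reference list.
artists = ["John Lennon", "Joan Jett"]
found = best_match("Jon Lenon", artists)
missing = best_match("Zzzz Qqqq", artists)
```

The threshold trades missed matches against false ones; a production system would tune it against known data and would likely use a faster index than comparing every pair.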
Founded in 1986 as Search Software America (SSA), Identity Systems became a wholly owned subsidiary of Nokia in 2006.