{"id":288,"date":"2006-09-10T16:43:00","date_gmt":"2006-09-10T16:43:00","guid":{"rendered":"http:\/\/www.dresan.com\/blog\/?p=288"},"modified":"2006-09-10T16:43:00","modified_gmt":"2006-09-10T16:43:00","slug":"large-scale-semantic-networks","status":"publish","type":"post","link":"https:\/\/dresan.com\/blog\/2006\/09\/10\/large-scale-semantic-networks\/","title":{"rendered":"Large-Scale Semantic Networks"},"content":{"rendered":"<p>Recently as both a followon to my thesis research and as a part of my work for the Search Engine That Starts With A G, I&#8217;ve been investigating a number of papers on graph theory as applied to semantic networks and Web link graphs.   This work isn&#8217;t &#8220;new&#8221; per se, since these models and techniques for graph analysis go back years if not decades, but it&#8217;s only recently &#8211; when vast real-world networks have become available in machine-readable form along with the computer power necessary to analyze them &#8211; that enough work has been done to illuminate how important it is to analyze the properties of networks in detail.<\/p>\n<p>One of the most interesting papers I&#8217;ve encountered so far was <a href=\"http:\/\/citeseer.ist.psu.edu\/newman03structure.html\">The Structure and Function of Complex Networks<\/a>, a survey of mathematical and empirical studies of networks that I had wish had been available when I was doing my thesis work.  The most important result I think to come out of recent graph theory is that simple uniform and random models of networks don&#8217;t tell the whole story &#8211; now, we have new mathematical tools for modeling a wide range of graph architectures, such as:<\/p>\n<ul><\/p>\n<li><b>Small World Networks<\/b>. A traditional graph model says nothing about how far you may need to travel to find an arbitrary node in the graph.  Think of square tiles on a floor &#8211; each tile is connected to four others, but the number of steps needed to reach any tile is a function of the distance. But anyone who&#8217;s played the parlor game &#8220;Six Degrees of Kevin Bacon&#8221; knows that real-world human relationships aren&#8217;t arranged this way: everyone is connected to everyone else in the world by a very short chain of relationships &#8211; on average, you can reach almost anyone in the world in only six or seven handshakes.  This shows up in graph analysis as the &#8220;mean geodesic distance&#8221; between two nodes in a graph, and is an important property we should measure about our networks as they grow to determine what kind of network structure we are really dealing with.  At the very least, a random graph structure shows small-world properties more similar to real-world graphs than uniform graphs, and we should consider using them over uniform models as a basis for analyses.<\/li>\n<li><b>Scale-Free Networks<\/b>. Both uniform networks &#8211; where every node has the same link structure &#8211; and random networks &#8211; where nodes are connected at random to each other across the graph &#8211; have a definite scale or rough average size.  Nodes in a random graph are like people: each one is unique, but their heights are distributed over a defnite scale so there are few people shorter than three feet and no people taller than nine.  Real world networks don&#8217;t have this definite scale: instead, they look the &#8220;same&#8221; no matter what size scale you&#8217;re looking at.  Nodes in a real world graph are like the distribution of city sizes:  <a href=\"http:\/\/interconnected.org\/notes\/2002\/12\/Origin_of_power_laws.txt\">for every city there are four times as many cities at half that size<\/a>.  This shows up technically in the &#8220;degree distribution&#8221;: the statistical pattern of the number of links on each node.  This will no doubt have significant effects on processes operating over networks like spreading activation.<\/li>\n<p><\/ul>\n<p>There are more issues in graph theory than I can readily summarize, including issues like resilience to deletion, models of growth, and so on; many of which are directly relevant to studies of semantic networks and processes that operate over them.<\/p>\n<p>Another paper, <a href=\"http:\/\/citeseer.ist.psu.edu\/641287.html\">The Large-Scale Structure of Semantic Networks<\/a>, applies these techniques to real-world semantic network models drawn from sources such as <a href=\"http:\/\/wordnet.princeton.edu\/\">WordNet<\/a> and <a href=\"http:\/\/thesaurus.reference.com\/\">Roget&#8217;s Thesaurus<\/a>.  This paper, like the survey article <a href=\"http:\/\/www.dbmi.columbia.edu\/publications\/docs\/fulltext\/DBMI-2006-003.pdf#search=%22large%20scale%20semantic%20networks%22\">Graph Theoretic Modeling of Large Scale Semantic Networks<\/a>, seems to find that real semantic networks have scale-free, small-world properties that aren&#8217;t found in the simpler mathematical models that people such as Francis (cough) used in his thesis.<\/p>\n<p>SO, anyone interested in semantic networks or spreading activation as a tool for modeling human cognition or as a representation scheme for an intelligent system would do well to follow up on these references and begin an analysis of their systems based on these &#8220;new&#8221; techniques (many of which have been around for a while, but sadly hadn&#8217;t reached everyone in the semantic network community (or, at least, hadn&#8217;t reached <b>me<\/b>) until more recently.<\/p>\n<p>So check them out!<br \/>-the Centaur<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Recently as both a followon to my thesis research and as a part of my work for the Search Engine That Starts With A G, I&#8217;ve been investigating a number&#8230;<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[8],"class_list":["post-288","post","type-post","status-publish","format-standard","hentry","category-uncategorized","tag-intelligence","ratio-2-1","entry"],"_links":{"self":[{"href":"https:\/\/dresan.com\/blog\/wp-json\/wp\/v2\/posts\/288","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dresan.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dresan.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dresan.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/dresan.com\/blog\/wp-json\/wp\/v2\/comments?post=288"}],"version-history":[{"count":0,"href":"https:\/\/dresan.com\/blog\/wp-json\/wp\/v2\/posts\/288\/revisions"}],"wp:attachment":[{"href":"https:\/\/dresan.com\/blog\/wp-json\/wp\/v2\/media?parent=288"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dresan.com\/blog\/wp-json\/wp\/v2\/categories?post=288"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dresan.com\/blog\/wp-json\/wp\/v2\/tags?post=288"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}