{"id":13260,"date":"2017-07-26T16:55:41","date_gmt":"2017-07-26T16:55:41","guid":{"rendered":"http:\/\/www.doanduyhai.com\/blog\/?p=13260"},"modified":"2017-08-02T15:27:24","modified_gmt":"2017-08-02T15:27:24","slug":"gremlin-recipes-2-sql-to-gremlin","status":"publish","type":"post","link":"https:\/\/www.doanduyhai.com\/blog\/?p=13260","title":{"rendered":"Gremlin recipes: 2 &#8211; SQL to Gremlin"},"content":{"rendered":"<p>This blog post is the 2<sup>nd<\/sup> from the series <strong>Gremlin Recipes<\/strong>. It is recommended to read the previous recipe first: <a href=\"https:\/\/www.doanduyhai.com\/blog\/?p=13224\" target=\"_blank\"><strong>1 &#8211; Gremlin as a Stream<\/strong><\/a><\/p>\n<p><!--more--><\/p>\n<h1>I KillrVideo dataset<\/h1>\n<p>To illustrate this series of recipes, you need first to create the schema for <strong>KillrVideo<\/strong> and import the data. See <a href=\"https:\/\/www.doanduyhai.com\/blog\/?p=13224#killrvideo_dataset\" target=\"_blank\"><strong>here<\/strong><\/a> for more details.<\/p>\n<p>The graph schema of this dataset is :<\/p>\n<p><iframe loading=\"lazy\" src=\"https:\/\/s3.amazonaws.com\/datastax-graph-schema-viewer\/index.html#\/?schema=killr_video_small.json\" height=\"600px\" width=\"100%\"><\/iframe><\/p>\n<h1>II Simple SQL queries translation<\/h1>\n<p>Let&#8217;s say we have a table <strong>movie<\/strong> with some properties: <\/p>\n<ul>\n<li>country: Text<\/li>\n<li>duration: Int<\/li>\n<li>movieId: Text<\/li>\n<li>production: Text<\/li>\n<li>title: Text<\/li>\n<li>year: Int<\/li>\n<\/ul>\n<p>Now we want to translate the SQL query <code><strong>SELECT * FROM Movie LIMIT 10<\/strong><\/code> into <strong>Gremlin<\/strong> traversal:<\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ngremlin&gt;g.\r\n    V().                \/\/ Iterator&lt;Vertex&gt;    \r\n    hasLabel(&quot;movie&quot;).  \/\/ Iterator&lt;Movie&gt;   \r\n    valueMap().         \/\/ Iterator&lt;Map&lt;String == property label, Object = property value&gt;&gt;\r\n    limit(10)\r\n==&gt;{duration=[104], country=[United States], year=[2003], production=[Paramount Pictures], movieId=[m75], title=[The Italian Job]}\r\n==&gt;{duration=[111], country=[United States], year=[1998], production=[20th Century Fox], movieId=[m617], title=[Great Expectations]}\r\n==&gt;{duration=[160], country=[United States], year=[2007], production=[Warner Bros. Pictures], movieId=[m337], title=[The Assassination of Jesse James By The Coward Robert Ford]}\r\n==&gt;{duration=[108], country=[United States], year=[1997], production=[20th Century Fox, Brandywine Productions], movieId=[m518], title=[Alien Resurrection]}\r\n==&gt;{duration=[175], country=[United States], year=[1954], production=[Warner Bros], movieId=[m777], title=[A Star Is Born]}\r\n==&gt;{duration=[109], country=[United States], year=[1990], production=[Carolco], movieId=[m917], title=[Total Recall]}\r\n==&gt;{duration=[127], country=[United States], year=[2012], production=[Fox 2000 Pictures, Haishang Films], movieId=[m546], title=[Life of Pi]}\r\n==&gt;{duration=[84], country=[United States], year=[1950], production=[20th Century Fox], movieId=[m468], title=[The Gunfighter]}\r\n==&gt;{duration=[115], country=[United States], year=[2000], production=[Franchise Pictures, Jonathan D. Krane, JTP Films], movieId=[m909], title=[Battlefield Earth]}\r\n==&gt;{duration=[108], country=[United States], year=[2004], production=[Focus Features], movieId=[m904], title=[Eternal Sunshine of the Spotless Mind]}\r\n<\/pre>\n<p>The step <code><strong>valueMap()<\/strong><\/code> is useful to get the properties label along with their values. If we had used the step <code><strong>values()<\/strong><\/code> we would only have the raw properties values:<\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ngremlin&gt;g.\r\n    V().               \/\/ Iterator&lt;Vertex&gt;    \r\n    hasLabel(&quot;movie&quot;). \/\/ Iterator&lt;Movie&gt;   \r\n    limit(2).\r\n    values()           \/\/ iterator.flatMap(movie -&gt; movies.properties()) == Iterator&lt;Object&gt;\r\n==&gt;m75\r\n==&gt;The Italian Job\r\n==&gt;2003\r\n==&gt;104\r\n==&gt;United States\r\n==&gt;Paramount Pictures\r\n==&gt;m617\r\n==&gt;Great Expectations\r\n==&gt;1998\r\n==&gt;111\r\n==&gt;United States\r\n==&gt;20th Century Fox\r\n<\/pre>\n<p>When using <code><strong>values()<\/strong><\/code> step, each movie vertex is transformed into a collection of its properties values and <strong>Gremlin<\/strong> implicitly applies a <code><strong>flatMap()<\/strong><\/code> operation so that the result is an <code><strong>Iterator&lt;Object&gt;<\/strong><\/code> instead of a nested structure <code><strong>Iterator&lt;Collection&lt;Object&gt;&gt;<\/strong><\/code>. Consequently we need to put the <code><strong>limit(2)<\/strong><\/code> <strong>before<\/strong> calling <code><strong>values()<\/strong><\/code> otherwise the limit will be applied on the collection of movies properties.<\/p>\n<p>Now, instead of fetching all properties of the movies, what if we only want to read the <strong>title<\/strong> ? <code><strong>SELECT title  FROM Movie LIMIT 10<\/strong><\/code>:<\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ngremlin&gt;g.\r\n    V().               \/\/ Iterator&lt;Vertex&gt;    \r\n    hasLabel(&quot;movie&quot;). \/\/ Iterator&lt;Movie&gt;   \r\n    values(&quot;title&quot;).   \/\/ iterator.map(movie -&gt; movie.getTitle()) == Iterator&lt;String&gt;\r\n    limit(10)          \r\n==&gt;The Italian Job\r\n==&gt;Great Expectations\r\n==&gt;The Assassination of Jesse James By The Coward Robert Ford\r\n==&gt;Alien Resurrection\r\n==&gt;A Star Is Born\r\n==&gt;Total Recall\r\n==&gt;Life of Pi\r\n==&gt;The Gunfighter\r\n==&gt;Battlefield Earth\r\n==&gt;Eternal Sunshine of the Spotless Mind\r\n<\/pre>\n<p>Easy. To select multiple columns <code><strong>SELECT country,duration,title,year FROM Movie LIMIT 10<\/strong><\/code> we can use of course <code><strong>values(\"country\",\"duration\",\"title\",\"year\")<\/strong><\/code> step but then all the properties will be mixed into a stream collection. It is better to rely on <code><strong>valueMap()<\/strong><\/code> to have a stream of key\/value pair:<\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ngremlin&gt;g.\r\n    V().                                           \/\/ Iterator&lt;Vertex&gt;    \r\n    hasLabel(&quot;movie&quot;).                             \/\/ Iterator&lt;Movie&gt;   \r\n    valueMap(&quot;country&quot;,&quot;duration&quot;,&quot;title&quot;,&quot;year&quot;). \/\/ Iterator&lt;Map&lt;String, Object&gt;&gt;\r\n    limit(10)          \r\n==&gt;{duration=[104], country=[United States], year=[2003], title=[The Italian Job]}\r\n==&gt;{duration=[111], country=[United States], year=[1998], title=[Great Expectations]}\r\n==&gt;{duration=[160], country=[United States], year=[2007], title=[The Assassination of Jesse James By The Coward Robert Ford]}\r\n==&gt;{duration=[108], country=[United States], year=[1997], title=[Alien Resurrection]}\r\n==&gt;{duration=[175], country=[United States], year=[1954], title=[A Star Is Born]}\r\n==&gt;{duration=[109], country=[United States], year=[1990], title=[Total Recall]}\r\n==&gt;{duration=[127], country=[United States], year=[2012], title=[Life of Pi]}\r\n==&gt;{duration=[84], country=[United States], year=[1950], title=[The Gunfighter]}\r\n==&gt;{duration=[115], country=[United States], year=[2000], title=[Battlefield Earth]}\r\n==&gt;{duration=[108], country=[United States], year=[2004], title=[Eternal Sunshine of the Spotless Mind]}\r\n<\/pre>\n<p>The usage of <code><strong>valueMap()<\/strong><\/code> has been thoroughly explained in <a href=\"https:\/\/www.doanduyhai.com\/blog\/?p=13224\" target=\"_blank\"><strong>1 &#8211; Gremlin as a Stream<\/strong><\/a>, please refer to it if you have any difficulty to understand the transformation.<\/p>\n<h1>III SQL WHERE clause in Gremlin traversal<\/h1>\n<p>Gremlin translates simply the WHERE clause of SQL language into the step <code><strong>filter()<\/strong><\/code> and <code><strong>has()<\/strong><\/code> steps. <code><strong>has()<\/strong><\/code> is a simple filter on vertex\/label property whereas <code><strong>filter<\/strong><\/code> can be applied to a traversal. Let&#8217;s say we want the query <code><strong>SELECT title FROM Movie WHERE year = 2000 LIMIT 5<\/strong><\/code>. The <strong>Gremlin<\/strong> traversal would be: <\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ngremlin&gt;g.\r\n    V().               \/\/ Iterator&lt;Vertex&gt;    \r\n    hasLabel(&quot;movie&quot;). \/\/ Iterator&lt;Movie&gt;   \r\n    has(&quot;year&quot;, 2000). \/\/ iterator.filter(movie -&gt; movie.getYear() == 2000)    \r\n    values(&quot;title&quot;).   \/\/ Iterator&lt;String&gt;\r\n    limit(5)\r\n==&gt;Cast Away\r\n==&gt;What Women Want\r\n==&gt;Dancer in the Dark\r\n==&gt;Crouching Tiger, Hidden Dragon\r\n==&gt;Snatch\r\n<\/pre>\n<p>For multiple filters, we can just chain the <code><strong>has()<\/strong><\/code> step as many time as needed, <code><strong>SELECT title,country,year FROM Movie WHERE year >= 2009 AND year <= 2011 AND country='United States' LIMIT 5<\/strong><\/code>:<\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ngremlin&gt;g.\r\n    V().                                \/\/ Iterator&lt;Vertex&gt;    \r\n    hasLabel(&quot;movie&quot;).                  \/\/ Iterator&lt;Movie&gt;   \r\n    has(&quot;year&quot;, gte(2009)).             \/\/ iterator.filter(movie -&gt; movie.getYear()&gt;=2000)    \r\n    has(&quot;year&quot;, lte(2011)).             \/\/ iterator.filter(movie -&gt; movie.getYear()&lt;=2011)\r\n    has(&quot;country&quot;, &quot;United States&quot;).    \/\/ iterator.filter(movie -&gt; movie.getCountry().equals(&quot;United States&quot;))    \r\n    valueMap(&quot;title&quot;,&quot;country&quot;,&quot;year&quot;). \/\/ Iterator&lt;String&gt;\r\n    limit(5)\r\n==&gt;{country=[United States], year=[2010], title=[Alice in Wonderland]}\r\n==&gt;{country=[United States], year=[2010], title=[Blue valentine]}\r\n==&gt;{country=[United States], year=[2011], title=[Friends with Benefits]}\r\n==&gt;{country=[United States], year=[2010], title=[Tangled]}\r\n==&gt;{country=[United States], year=[2009], title=[Nine]}\r\n<\/pre>\n<p>Now if we want some more complex filtering using traversal, we can use the <code><strong>filter()<\/strong><\/code> step:<\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ngremlin&gt;g.\r\n    V().\r\n    hasLabel(&quot;movie&quot;).\r\n    filter(out(&quot;director&quot;).has(&quot;name&quot;, &quot;Woody Allen&quot;)).\r\n    values(&quot;title&quot;).\r\n    limit(5)\r\n==&gt;Love and Death\r\n==&gt;Match Point\r\n==&gt;Take the Money and Run\r\n==&gt;Scoop\r\n==&gt;Deconstructing Harry\r\n<\/pre>\n<h1>IV GROUP BY and ORDER BY translation<\/h1>\n<p>Let&#8217;s start with <code><strong>ORDER BY<\/strong><\/code> clause. The <code><strong>SELECT title, duration FROM Movie ORDER BY duration DESC LIMIT 5<\/strong><\/code> query translates into:<\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ngremlin&gt;g.\r\n    V().\r\n    hasLabel(&quot;movie&quot;).\r\n    order().by(&quot;duration&quot;, decr).\r\n    valueMap(&quot;title&quot;,&quot;duration&quot;).\r\n    limit(5)\r\n==&gt;{duration=[238], title=[Gone With the Wind]}\r\n==&gt;{duration=[225], title=[Once Upon a Time in America]}\r\n==&gt;{duration=[222], title=[Lawrence of Arabia]}\r\n==&gt;{duration=[211], title=[Ben-Hur]}\r\n==&gt;{duration=[205], title=[Seven Samurai]}\r\n<\/pre>\n<p>The <strong>decr<\/strong> keyword means decrementing == DESCENDING. So far so good, it is also possible to order by multiple properties. Let&#8217;s say we want to order the movies first by their title and then their duration (the example is quite contrived but just good enough for explanation): <code><strong>SELECT title, duration FROM Movie ORDER BY title ASC, duration DESC LIMIT 5<\/strong><\/code><\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ngremlin&gt;g.\r\n    V().\r\n    hasLabel(&quot;movie&quot;).\r\n    order().\r\n      by(&quot;title&quot;, incr).\r\n      by(&quot;duration&quot;, decr).\r\n    valueMap(&quot;title&quot;,&quot;duration&quot;).\r\n    limit(5)\r\n==&gt;{duration=[96], title=[(500) Days of Summer]}\r\n==&gt;{duration=[97], title=[10 Things I Hate about You]}\r\n==&gt;{duration=[109], title=[10,000 B.C.]}\r\n==&gt;{duration=[79], title=[101 Dalmatians]}\r\n==&gt;{duration=[95], title=[12 Angry Men]}\r\n<\/pre>\n<p>Now let&#8217;s add <code><strong>GROUP BY<\/strong><\/code> into the mix. We want to have the top 2 longest movies for each country. The SQL query is:<br \/>\n<code><strong><br \/>\nSELECT country, max(duration) AS longest_duration<br \/>\nFROM Movie<br \/>\nGROUP BY country<br \/>\nORDER BY max(duration) DESC<br \/>\nLIMIT 5<br \/>\n<\/strong><\/code><\/p>\n<p>How would we translate this query into <strong>Gremlin<\/strong> traversal ? Grouping by country is a piece of cake, if you are not familiar with grouping in <strong>Gremlin<\/strong>, I recommend reading the previous post at section <strong><a href=\"https:\/\/www.doanduyhai.com\/blog\/?p=13224#group_by\" target=\"_blank\">III Grouping the stream<\/a><\/strong><\/p>\n<p>So the obvious <strong>Gremlin<\/strong> traversal would be:<\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ngremlin&gt;g.\r\n    V().                            \/\/ Iterator&lt;Vertex&gt;\r\n    hasLabel(&quot;movie&quot;).              \/\/ Iterator&lt;Movie&gt;   \r\n    group().\r\n      by(&quot;country&quot;).                \/\/ Iterator&lt;Map&lt;String == country, Collection&lt;Movie&gt;&gt;&gt;\r\n      by(values(&quot;duration&quot;).max()). \/\/ Iterator&lt;Map&lt;String == country, Long == duration of longest movie&gt;&gt;\r\n    order().\r\n      by(values, decr).       \r\n    limit(5)\r\n==&gt;{Argentina=126, Hong Kong=120, United States=238, Japan=205, United Kingdom=222, Soviet Union (USSR)=165, Spain=143, Russia=114, New Zealand=121, Canada=96, South Korea=143, Austria=127, Ireland=135, China=89, Taiwan=119, Brazil=130, Germany - West Germany=94, Denmark=177, Italy=165, Mexico=150, South Africa=111, France=180, Australia=165, Germany=153}\r\n<\/pre>\n<p>The result is more than surprising, in fact <strong>it is completely wrong<\/strong>. Some explanation is needed.<\/p>\n<p>The step <code><strong>order().by(values, decr)<\/strong><\/code> would order the stream of data (thus <code><strong>Iterator&lt;Map&lt;String,Long&gt;&gt;<\/strong><\/code> ) by its content e.g. order all the <code><strong>Maps<\/strong><\/code> inside the <code><strong>Iterator<\/strong><\/code> in descending order. The problem is that <strong>this ordering is useless and has no effect<\/strong> because inside the the <code><strong>Iterator<\/strong><\/code>, there is only a single <code><strong>Map<\/strong><\/code>. The same remark applies to the <code><strong>limit(5)<\/strong><\/code>. Since there is only one <code><strong>Map<\/strong><\/code> any number for limit will yield the same result anyway.<\/p>\n<p>To convince ourselves, we can just run the traversal to count the number of elements inside the the <code><strong>Iterator<\/strong><\/code>:<\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ngremlin&gt;g.\r\n    V().                            \/\/ Iterator&lt;Vertex&gt;\r\n    hasLabel(&quot;movie&quot;).              \/\/ Iterator&lt;Movie&gt;   \r\n    group().\r\n      by(&quot;country&quot;).                \/\/ Iterator&lt;Map&lt;String == country, Collection&lt;Movie&gt;&gt;&gt;\r\n      by(values(&quot;duration&quot;).max()). \/\/ Iterator&lt;Map&lt;String == country, Long == duration of longest movie&gt;&gt;\r\n    order().\r\n      by(values, decr).       \r\n    count()\r\n==&gt;1\r\n<\/pre>\n<p>To fix this problem, <strong>Gremlin<\/strong> offers 2 solutions<\/p>\n<h3>A Using &#8220;local&#8221; keyword<\/h3>\n<p>What we want is not to order the items inside the <code><strong>Iterator<\/strong><\/code> but to order the entries inside each <code><strong>Map<\/strong><\/code>. For that we need to use the <strong>local<\/strong> keyword<\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ngremlin&gt;g.\r\n    V().                            \/\/ Iterator&lt;Vertex&gt;\r\n    hasLabel(&quot;movie&quot;).              \/\/ Iterator&lt;Movie&gt;   \r\n    group().\r\n      by(&quot;country&quot;).                \/\/ Iterator&lt;Map&lt;String == country, Collection&lt;Movie&gt;&gt;&gt;\r\n      by(values(&quot;duration&quot;).max()). \/\/ Iterator&lt;Map&lt;String == country, Long == duration of longest movie&gt;&gt;\r\n    order(local).\r\n      by(values, decr).       \r\n    limit(local, 5)\r\n==&gt;{United States=238, United Kingdom=222, Japan=205, France=180, Denmark=177}\r\n<\/pre>\n<p>Now we have the correct result. <code><strong>order(local).by(values, decr)<\/strong><\/code> will order each nested <code><strong>Map<\/strong><\/code> by its value (please note that you can also decide to order by key using <code><strong>by(keys,  decr)<\/strong><\/code> as well). <code><strong>limit(local, 5)<\/strong><\/code> will only take the first 5 elements inside the <code><strong>Map<\/strong><\/code> in the <code><strong>Iterator<\/strong><\/code>.<\/p>\n<h3>B Using &#8220;unfold()&#8221; step<\/h3>\n<p>The problem with our initial traversal (the one without the <code><strong>local<\/strong><\/code> keyword) was that we performed the ordering and limit on the wrong level. With the step <code><strong>unfold()<\/strong><\/code> which is quite similar to the Java stream <code><strong>flatMap()<\/strong><\/code> operation, we can extract all map entries from the internal <code><strong>Map<\/strong><\/code> and move them to the <code><strong>Iterator<\/strong><\/code> level so that the grouping and limit work again:<\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ngremlin&gt;g.\r\n    V().                            \/\/ Iterator&lt;Vertex&gt;\r\n    hasLabel(&quot;movie&quot;).              \/\/ Iterator&lt;Movie&gt;   \r\n    group().\r\n      by(&quot;country&quot;).                \/\/ Iterator&lt;Map&lt;String == country, Collection&lt;Movie&gt;&gt;&gt;\r\n      by(values(&quot;duration&quot;).max()). \/\/ Iterator&lt;Map&lt;String == country, Long == duration of longest movie&gt;&gt;\r\n    unfold().                       \/\/ Iterator&lt;MapEntry&lt;String == country, Long == duration of longest movie&gt;&gt;\r\n    order().\r\n      by(values, decr).       \r\n    limit(5)\r\n==&gt;United States=238\r\n==&gt;United Kingdom=222\r\n==&gt;Japan=205\r\n==&gt;France=180\r\n==&gt;Denmark=177\r\n<\/pre>\n<p>And of course we have the same results as previously. <\/p>\n<p>Please note that the display format does differ though. <\/p>\n<p>The explanation is simple. By using <code><strong>limit(local, 5)<\/strong><\/code> <strong>Gremlin<\/strong> will display the first element of the <code><strong>Iterator<\/strong><\/code> with only 5 elements inside which is the <code><strong>Map<\/strong><\/code> truncated to its top 5 element, thus the JSON display of a map content <code><strong>{United States=238, United Kingdom=222, Japan=205, France=180, Denmark=177}<\/strong><\/code>.<\/p>\n<p>Whereas with <code><strong>unfold()<\/strong><\/code>, the <code><strong>limit(5)<\/strong><\/code> is applied on the <code><strong>Iterator<\/strong><\/code> itself so <strong>Gremlin<\/strong> will show you only the first 5 map entries and that explains the display formatting: <\/p>\n<p><code><br \/>\n==>United States=238<br \/>\n==>United Kingdom=222<br \/>\n==>Japan=205<br \/>\n==>France=180<br \/>\n==>Denmark=177<br \/>\n<\/code><\/p>\n<h1>V JOIN translation<\/h1>\n<p>Let&#8217;s now tackle the biggest feature of relational databases: joins. With <strong>Gremlin<\/strong>, joins translate into simple traversals and you barely realise it. Let&#8217;s take the following join query:<\/p>\n<p><code><strong><br \/>\nSELECT Movie.title, Genre.name FROM Movie<br \/>\nINNER JOIN Movie_Genre ON Movie.movieId = Movie_Genre.movieId<br \/>\nINNER JOIN Genre ON Movie_Genre.genreId = Genre.genreId<br \/>\nWHERE Genre.name = \"Sci-Fi\"<br \/>\nLIMIT 10<br \/>\n<\/strong><\/code><\/p>\n<p>We want to display the 10 movies title and genre name of all movies whose genres contain at least &#8220;Sci-Fi&#8221;.<\/p>\n<p>A naive Gremlin traversal would be:<\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ngremlin&gt;g.\r\n  V(). \r\n  hasLabel(&quot;movie&quot;).\r\n  filter(out(&quot;belongsTo&quot;).has(&quot;name&quot;, &quot;Sci-Fi&quot;)).   \r\n  values(&quot;title&quot;, out(&quot;belongsTo&quot;).values(&quot;name&quot;)).\r\n  limit(10)\r\nNo signature of method: org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.DefaultGraphTraversal.values() is applicable for argument types: (java.lang.String, org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.DefaultGraphTraversal) values: [title, [VertexStep(OUT,[belongsTo],vertex), PropertiesStep([name],value)]]\r\nType ':help' or ':h' for help.\r\nDisplay stack trace? [yN]\t\r\n<\/pre>\n<p>There are 2 interesting remarks. First the JOIN in SQL is achieved by the traversal <code><strong>filter(out(\"belongsTo\").has(\"name\", \"Sci-Fi\"))<\/strong><\/code>. This traversal also performs the filtering on <em>Genre name == Sci-Fi<\/em>. Second, the traversal generates an exception, which basically says that the step <code><strong>values()<\/strong><\/code> does not support the arguments <code><strong>[title, [VertexStep(OUT,[belongsTo],vertex), PropertiesStep([name],value)]]<\/strong><\/code>. Decoded for mere mortals, it means that you cannot use inner traversal inside the <code><strong>values()<\/strong><\/code> step. Indeed <code><strong>values()<\/strong><\/code> can only be used to fetch the properties of a vertex\/edge.<\/p>\n<p>This lets us with the pending question: <em>how to display the movie title alongside with its genre(s) ?<\/em><\/p>\n<p>For this, we need to use the <code><strong>project()<\/strong><\/code> step. Projection in Gremlin is very similar to projection in SQL and this step can accept inner traversal, which is what we want.<\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ngremlin&gt;g.\r\n  V(). \r\n  hasLabel(&quot;movie&quot;).\r\n  filter(out(&quot;belongsTo&quot;).has(&quot;name&quot;, &quot;Sci-Fi&quot;)). \/\/ JOIN ... WHERE Genre.name = &quot;Sci-Fi&quot;  \r\n  project(&quot;title&quot;, &quot;genres&quot;).                     \/\/ SELECT\r\n    by(values(&quot;title&quot;)).                          \/\/   Movie.Title\r\n    by(out(&quot;belongsTo&quot;).values(&quot;name&quot;).fold()).   \/\/   Genre.name\r\n  limit(10)\r\n==&gt;{title=Alien Resurrection, genres=[Sci-Fi, Horror]}\r\n==&gt;{title=Total Recall, genres=[Sci-Fi, Action]}\r\n==&gt;{title=Battlefield Earth, genres=[Sci-Fi, Action]}\r\n==&gt;{title=Eternal Sunshine of the Spotless Mind, genres=[Romance, Sci-Fi, Comedy, Drama]}\r\n==&gt;{title=Back to the Future. Part III, genres=[Sci-Fi, Western, Adventure, Comedy, Fantasy]}\r\n==&gt;{title=The Avengers, genres=[Sci-Fi, Action, Fantasy]}\r\n==&gt;{title=Back to the Future, genres=[Sci-Fi, Adventure, Comedy, Fantasy]}\r\n==&gt;{title=El gran marciano, genres=[Sci-Fi, Comedy]}\r\n==&gt;{title=Source Code, genres=[Sci-Fi, Thriller, Action]}\r\n==&gt;{title=2001: A Space Odyssey, genres=[Sci-Fi]}\t\r\n<\/pre>\n<p>The step <code><strong>project()<\/strong><\/code> needs some explanation. First the arguments to this step represent the alias\/label of the values being projected (the equivalent in SQL land is <code><strong>SELECT xxx AS label<\/strong><\/code>). Here the labels are <em>&#8220;title&#8221;<\/em> and <em>&#8220;genres&#8221;<\/em> but you can put whatever String value you want as long as they remain meaningful for the reader.<\/p>\n<p>Next the <code><strong>project()<\/strong><\/code> step is followed by some <code><strong>by()<\/strong><\/code> <strong><a href=\"http:\/\/tinkerpop.apache.org\/docs\/3.2.5\/reference\/#by-step\" target=\"_blank\">modulators<\/a><\/strong>. There should be as many <code><strong>by()<\/strong><\/code> modulators as there are arguments to the <code><strong>project()<\/strong><\/code> step. Here we need 2 <code><strong>by()<\/strong><\/code> modulators. <\/p>\n<p>The first <code><strong>by()<\/strong><\/code> specifies on which value we want to project the label <em>&#8220;title&#8221;<\/em> to. Obviously we want to read the <em>&#8220;title&#8221;<\/em> property of the Movie vertex so <code><strong>by(values(\"title\"))<\/strong><\/code> is straightforward.<\/p>\n<p>The second <code><strong>by()<\/strong><\/code> specifies on which value we want to project the label <em>&#8220;genres&#8221;<\/em> to. Since the genres are not intrinsic properties of the Movie vertex we need a traversal: <code><strong>by(out(\"belongsTo\").values(\"name\").fold())<\/strong><\/code>. And because traversing the edge <em>&#8220;belongsTo&#8221;<\/em> can lead us to a 1-to-N relationship (a movie can belong to multiple genres) we need the <code><strong>fold()<\/strong> <\/code> operator to concatenate all the genre names into a single collection to be displayed with each movie title.<\/p>\n<p>And that&#8217;s all folks! <strong>Do not miss the other Gremlin recipes in this series<\/strong>.<\/p>\n<p>If you have any question about <strong>Gremlin<\/strong>, find me on the <strong><a href=\"http:\/\/datastaxacademy.slack.com\" target=\"_blank\">datastaxacademy.slack.com<\/a><\/strong>, channel <strong>dse-graph<\/strong>. My id is <em>@doanduyhai<\/em>   <\/p>\n","protected":false},"excerpt":{"rendered":"<p>This blog post is the 2nd from the series Gremlin Recipes. It is recommended to read the previous recipe first: 1 &#8211; Gremlin as a Stream<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[58,10],"tags":[],"_links":{"self":[{"href":"https:\/\/www.doanduyhai.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/13260"}],"collection":[{"href":"https:\/\/www.doanduyhai.com\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.doanduyhai.com\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.doanduyhai.com\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.doanduyhai.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=13260"}],"version-history":[{"count":22,"href":"https:\/\/www.doanduyhai.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/13260\/revisions"}],"predecessor-version":[{"id":13348,"href":"https:\/\/www.doanduyhai.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/13260\/revisions\/13348"}],"wp:attachment":[{"href":"https:\/\/www.doanduyhai.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=13260"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.doanduyhai.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=13260"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.doanduyhai.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=13260"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}