For a month or two, I have been exploring Neo4J, a graph database built for storing huge amounts of data. Another popular graph database that I will be delving into is InfiniteGraph from Objectivity.
I have also been working on Kundera (a JPA 2.0 based object-datastore mapping library for NoSQL datastores) as a key contributor. It already supports popular databases like Cassandra, HBase, MongoDB and Redis. So the next thought on our minds was to support this wonderful and popular graph database; you guessed it right: Neo4J.
The JPA specification was not written with NoSQL datastores in mind, and graph databases are a different story altogether: they take object-mapping challenges to the next level. It made us sweat and argue for countless hours over how to fit JPA into the graph world. We dived deeper into both JPA and Neo4J, and here is how our journey unfolded and the key decisions we made…
SpringData for Neo4J is a similar effort that attempts POJO-based development for Neo4J. In terms of ease of use, our goals converged. It introduces its own annotations for two categories of entities: NodeEntity and RelationshipEntity. We were constrained to JPA standards and decided not to introduce any new annotations.
The next item on our minds was to make the rules for entity definition expressive enough for users to describe a graph structure in the form of Java entity classes. Here is what we thought was best suited and possible:
1. Both Node and Relationship POJOs would be annotated with @Entity.
2. By a graph’s very nature, relationships between entities are always @ManyToMany. So we decided to discard other forms of relationship for the sake of simplicity.
3. The biggest challenge was to fit a “Relationship Entity” between different classes of “Node Entities”.
Take, for example, the case of Actor and Movie node entities joined via the relationship entity Role. Actor is related to Movie via the Role entity. Until now we had been keeping a List or Set of entities as relations. But this approach wasn’t sufficient, as there was a third dimension here (in the form of Role).
Map collections in JPA came to our rescue. This means the Actor entity class can have a Map containing Role as key and Movie as value. So far so good. The next thing was to choose the relationship type and direction.
The relationship type could be read from the @MapKeyJoinColumn annotation. Direction was implicit because a bidirectional relationship always has an owning side (mappedBy is specified on the other side). So the relationship direction could easily be derived as OUTGOING from Actor to Movie.
4. The next consideration was to replicate the flexibility of navigation that Neo4J provides in jumping from one node to other nodes via relationships. A bidirectional relationship made it possible to navigate from Actor to Movies via Role and vice versa. We decided to let the user define incoming and outgoing node entity attributes in the relationship entity too (in addition to the relationship entity’s own attributes), which makes it easy to navigate from Role to both Actors and Movies.
5. In my previous experience with other NoSQL databases, it didn’t matter whether the database was on localhost or some other machine. We provided a host and port for creating a connection and used it just like an RDBMS. In Neo4J, we’ve got two options:
- Embedded Database – In case database is expected to run on the same machine (faster but less flexible)
- REST interface – In case database is on some other machine. (slower but more flexible)
This means we needed to create two translations of user CRUD calls and give users a way of choosing between Embedded and REST.
6. The next item on our plate was how to interpret JPA queries and run them on indexes. Since indexes in Neo4J are stored in Lucene in the simplest configuration (and it was easy to build a JPA-to-Lucene conversion engine), we decided to translate all JPA queries into Lucene ones and run them directly on the indexes.
We identified three types of native queries, though: Lucene, Cypher and Gremlin. We started with Lucene because it was the simplest to implement, and decided to add support for the remaining ones in subsequent releases.
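Going back to points 1 to 4, the Actor, Role and Movie shape can be pictured with a small plain-Java sketch. This is not Kundera code: the JPA annotations are only mentioned in comments so that the sketch compiles with the JDK alone, and the names actIn, outgoing and incoming are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Node entity. In a JPA mapping this class would carry @Entity.
class Movie {
    final String title;
    // Inverse side of the relationship: in JPA terms this map would use mappedBy.
    final Map<Role, Actor> actors = new LinkedHashMap<>();

    Movie(String title) {
        this.title = title;
    }
}

// Node entity and owning side of the relationship.
class Actor {
    final String name;
    // Relationship entity as key, target node as value
    // (in JPA terms: @ManyToMany plus @MapKeyJoinColumn for the relationship type).
    final Map<Role, Movie> movies = new LinkedHashMap<>();

    Actor(String name) {
        this.name = name;
    }

    // Hypothetical helper: links this actor to a movie and wires up both directions.
    Role actIn(Movie movie, String roleName) {
        Role role = new Role(roleName, this, movie);
        movies.put(role, movie);
        movie.actors.put(role, this);
        return role;
    }
}

// Relationship entity: its own attribute plus references to both end nodes.
class Role {
    final String roleName;
    final Actor outgoing; // assumed naming: the node the relationship starts from
    final Movie incoming; // assumed naming: the node the relationship points to

    Role(String roleName, Actor outgoing, Movie incoming) {
        this.roleName = roleName;
        this.outgoing = outgoing;
        this.incoming = incoming;
    }
}

class GraphMappingSketch {
    public static void main(String[] args) {
        Actor actor = new Actor("Example Actor");
        Movie movie = new Movie("Example Movie");
        Role role = actor.actIn(movie, "Lead");
        // Navigate from the relationship to both of its end nodes.
        System.out.println(role.outgoing.name + " -[" + role.roleName + "]-> " + role.incoming.title);
    }
}
```

Keeping the relationship object as the map key is what lets a single collection carry the third dimension (Role) along with the target node.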
So, summing it all up, it was challenging but rewarding to marry these two heterogeneous worlds. Once this comes to fruition, we shall look for more refinements and additions. I shall post Kundera-Neo4J documentation links and code examples once it’s released.
MongoDB supports 2-dimensional geospatial indexes. You can watch the MongoSF (May 2011) presentation to understand them at a basic level.
In this article, I will help you quickly write the geospatial queries described in the above presentation using the Java programming language. I assume you have a MongoDB server up and running on your machine. Source code for this article is available at this GitHub project.
A repository of Geographical Places
The Java class below stores major cities in California, US, as a String array. We’ll use this array in the next program.
public class Places
{
public static final String[] cities = new String[]
{ "Palos Verdes Estates", "Los Altos Hills", "Hillsborough", "Monte Sereno", "Villa Park", "Palo Alto",
"Belvedere", "Los Altos", "Rolling Hills", "Montecito", "Piedmont", "Foster City", "Yorba Linda",
"Mission Canyon", "Saratoga", "Orinda", "Manhattan Beach", "Pleasanton", "Imperial", "Goleta", "Tiburon",
"Tustin Foothills", "Rancho Palos Verdes", "Mountain View", "La Habra Heights", "Newport Beach",
"Toro Canyon", "Agoura Hills", "Redondo Beach", "Menlo Park", "Mill Valley", "Indian Wells", "Moraga",
"Ross", "La Palma", "Kensington", "Hermosa Beach", "Thousand Oaks", "Belmont", "Rolling Hills Estates",
"Loyola", "Summerland", "Santa Monica", "Rossmoor", "Irvine", "Lafayette", "Laguna Niguel", "Torrance",
"Fairbanks Ranch", "Cupertino", "Santa Barbara", "Portola Valley", "Woodside", "San Ramon", "Santa Ynez",
"Emerald Lake Hills", "Angwin", "El Segundo", "Orange", "West Menlo Park", "West Bishop", "Ladera Heights",
"Huntington Beach", "Atherton", "Coronado", "Danville", "Diamond Bar", "Rancho Santa Fe", "Chino Hills",
"Clayton", "Walnut", "San Anselmo", "Solvang", "Cerritos", "Blackhawk-Camino Tassajara",
"Highlands-Baywood Park", "Fountain Valley", "Westlake Village", "Sunnyvale", "Poway", "Del Monte Forest",
"Brea", "San Carlos", "Los Gatos", "Rancho Santa Margarita", "Camarillo", "Cypress", "Newport Coast",
"San Joaquin Hills", "Folsom", "Arroyo Grande", "Malibu", "Sausalito", "Del Rio", "Green Valley",
"Mission Viejo", "Aliso Viejo", "Stanford", "Encinitas", "Rancho Mirage" };
}
Geospatial Queries Program
This program runs the following methods in this order:
1. addPlaces() – Adds the 100 California cities listed above on a 10x10 plane. Data is stored in a MongoDB collection named “places”.
2. findWithinCircle() – Finds all cities within a circle of radius 1 unit (of our imaginary plane) centred at (5,5).
3. findWithinBox() – Finds all cities within the rectangle formed between coordinates (4,4) and (6,6).
4. findWithinPolygon() – Finds all cities within a polygon (a triangle in this case) formed from 3 coordinates.
5. findCenterSphere() – Same as 2, but the query is spherical in nature.
6. findNear() – Finds all cities near (4,4) within a maximum distance of 2 units.
7. findNearSphere() – Same as above, but the query is spherical in nature.
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;
import com.mongodb.BasicDBList;
import com.mongodb.BasicDBObject;
import com.mongodb.DBAddress;
import com.mongodb.DBCollection;
import com.mongodb.DBCursor;
import com.mongodb.DBObject;
import com.mongodb.Mongo;
import com.mongodb.MongoException;
/**
* Example code for Geospatial queries in MongoDB
* @author amresh.singh
*/
public class GeospatialExample
{
public static final String dbName = "geospatial";
public static final String host = "127.0.0.1";
public static final int port = 27017;
public static final String collectionName = "places";
public static final String indexName = "geospatialIdx";
Mongo mongo;
DBCollection collection;
private Mongo getMongo() {
try
{
mongo = new Mongo(new DBAddress(host, port, dbName));
}
catch (MongoException e)
{
e.printStackTrace();
}
catch (UnknownHostException e)
{
e.printStackTrace();
}
return mongo;
}
public static void main(String[] args)
{
new GeospatialExample().runExample();
}
private void runExample()
{
collection = getMongo().getDB(dbName).getCollection(collectionName);
collection.ensureIndex(new BasicDBObject("loc", "2d"), indexName);
addPlaces();
findWithinCircle();
findWithinBox();
findWithinPolygon();
findCenterSphere();
findNear();
findNearSphere();
}
private void findWithinCircle()
{
System.out.println("findWithinCircle\n----------------------\n");
List circle = new ArrayList();
circle.add(new double[] { 5, 5 }); // Centre of circle
circle.add(1); // Radius
BasicDBObject query = new BasicDBObject("loc", new BasicDBObject("$within",
new BasicDBObject("$center", circle)));
printOutputs(query);
}
private void findWithinBox()
{
System.out.println("findWithinBox\n----------------------\n");
List box = new ArrayList();
box.add(new double[] { 4, 4 }); //Starting coordinate
box.add(new double[]{6,6}); // Ending coordinate
BasicDBObject query = new BasicDBObject("loc", new BasicDBObject("$within",
new BasicDBObject("$box", box)));
printOutputs(query);
}
private void findWithinPolygon()
{
System.out.println("findWithinPolygon\n----------------------\n");
List polygon = new ArrayList();
polygon.add(new double[] { 3, 3 }); // First vertex
polygon.add(new double[]{8,3}); // Second vertex
polygon.add(new double[]{6,7}); // Third vertex
BasicDBObject query = new BasicDBObject("loc", new BasicDBObject("$within",
new BasicDBObject("$polygon", polygon)));
printOutputs(query);
}
private void findNear() {
System.out.println("findNear\n----------------------\n");
BasicDBObject filter = new BasicDBObject("$near", new double[] { 4, 4 });
filter.put("$maxDistance", 2);
BasicDBObject query = new BasicDBObject("loc", filter);
printOutputs(query);
}
private void findNearSphere() {
System.out.println("findNearSphere\n----------------------\n");
BasicDBObject filter = new BasicDBObject("$nearSphere", new double[] { 5, 5 });
filter.put("$maxDistance", 0.06);
// $maxDistance is given in radians here; radius of the earth in miles: 3959.8728
BasicDBObject query = new BasicDBObject("loc", filter);
printOutputs(query);
}
private void findCenterSphere() {
System.out.println("findCenterSphere\n----------------------\n");
List circle = new ArrayList();
circle.add(new double[] { 5, 5 }); // Centre of circle
circle.add(0.06); // Radius
BasicDBObject query = new BasicDBObject("loc", new BasicDBObject("$within",
new BasicDBObject("$centerSphere", circle)));
printOutputs(query);
}
public void printOutputs(BasicDBObject query)
{
DBCursor cursor = collection.find(query);
List<BasicDBList> outputs = new ArrayList<BasicDBList>();
while (cursor.hasNext())
{
DBObject result = cursor.next();
System.out.println(result.get("name") + "--->" + result.get("loc"));
outputs.add((BasicDBList) result.get("loc"));
}
for (int y = 9; y >= 0; y--)
{
String s = "";
for (int x = 0; x < 10; x++)
{
boolean found = false;
for (BasicDBList obj : outputs)
{
double xVal = (Double) obj.get(0);
double yVal = (Double) obj.get(1);
if (yVal == y && xVal == x)
{
found = true;
}
}
if(found) {
s = s + " @";
} else {
s = s + " +";
}
}
System.out.println(s);
}
}
private void addPlaces()
{
System.out.println("Adding places...");
for (int i = 0; i < 100; i++)
{
double x = i % 10;
double y = Math.floor(i / 10);
addPlace(collection, Places.cities[i], new double[] { x, y });
}
System.out.println("All places added");
}
private void addPlace(DBCollection collection, String name, final double[] location)
{
final BasicDBObject place = new BasicDBObject();
place.put("name", name);
place.put("loc", location);
collection.insert(place);
}
}
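The 0.06 passed to $maxDistance in findNearSphere() and to $centerSphere in findCenterSphere() is an angle in radians, not a length on the plane, and spherical queries read each pair as (longitude, latitude). The following is a rough sketch, independent of the MongoDB driver, of what that angle means. The class and method names are made up for illustration, and the haversine formula is a standard way to compute a great-circle angle, not necessarily what the server uses internally.

```java
class SphericalDistanceSketch {
    // Earth radius in miles, taken from the comment in findNearSphere().
    static final double EARTH_RADIUS_MILES = 3959.8728;

    /** Converts an angular distance in radians to miles along the earth's surface. */
    static double radiansToMiles(double radians) {
        return radians * EARTH_RADIUS_MILES;
    }

    /** Great-circle angle in radians between two (longitude, latitude) points given in degrees. */
    static double haversineRadians(double lon1, double lat1, double lon2, double lat2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.pow(Math.sin(dLat / 2), 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.pow(Math.sin(dLon / 2), 2);
        return 2 * Math.asin(Math.sqrt(a));
    }

    public static void main(String[] args) {
        System.out.println("0.06 rad in miles: " + radiansToMiles(0.06));
        // (8,6) lies inside the 0.06 rad radius around (5,5); (8,7) lies outside it.
        System.out.println("(5,5) to (8,6): " + haversineRadians(5, 5, 8, 6) + " rad");
        System.out.println("(5,5) to (8,7): " + haversineRadians(5, 5, 8, 7) + " rad");
    }
}
```

Since 0.06 radians is roughly 3.4 degrees, the spherical queries reach much further on the 10x10 grid than the planar circle of radius 1.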
MongoDB Console (after running program)
amresh@ubuntu:/usr/local/mongodb-linux-i686-1.8.1/bin$ ./mongo
MongoDB shell version: 1.8.1
connecting to: test
> use geospatial;
switched to db geospatial
> db.places.find();
{ "_id" : ObjectId("50747ebf44ae3dfd6b8eb40a"), "name" : "Palos Verdes Estates", "loc" : [ 0, 0 ] }
{ "_id" : ObjectId("50747ebf44ae3dfd6b8eb40b"), "name" : "Los Altos Hills", "loc" : [ 1, 0 ] }
{ "_id" : ObjectId("50747ebf44ae3dfd6b8eb40c"), "name" : "Hillsborough", "loc" : [ 2, 0 ] }
{ "_id" : ObjectId("50747ebf44ae3dfd6b8eb40d"), "name" : "Monte Sereno", "loc" : [ 3, 0 ] }
{ "_id" : ObjectId("50747ebf44ae3dfd6b8eb40e"), "name" : "Villa Park", "loc" : [ 4, 0 ] }
{ "_id" : ObjectId("50747ebf44ae3dfd6b8eb40f"), "name" : "Palo Alto", "loc" : [ 5, 0 ] }
{ "_id" : ObjectId("50747ebf44ae3dfd6b8eb410"), "name" : "Belvedere", "loc" : [ 6, 0 ] }
{ "_id" : ObjectId("50747ebf44ae3dfd6b8eb411"), "name" : "Los Altos", "loc" : [ 7, 0 ] }
{ "_id" : ObjectId("50747ebf44ae3dfd6b8eb412"), "name" : "Rolling Hills", "loc" : [ 8, 0 ] }
{ "_id" : ObjectId("50747ebf44ae3dfd6b8eb413"), "name" : "Montecito", "loc" : [ 9, 0 ] }
{ "_id" : ObjectId("50747ebf44ae3dfd6b8eb414"), "name" : "Piedmont", "loc" : [ 0, 1 ] }
{ "_id" : ObjectId("50747ebf44ae3dfd6b8eb415"), "name" : "Foster City", "loc" : [ 1, 1 ] }
{ "_id" : ObjectId("50747ebf44ae3dfd6b8eb416"), "name" : "Yorba Linda", "loc" : [ 2, 1 ] }
{ "_id" : ObjectId("50747ebf44ae3dfd6b8eb417"), "name" : "Mission Canyon", "loc" : [ 3, 1 ] }
{ "_id" : ObjectId("50747ebf44ae3dfd6b8eb418"), "name" : "Saratoga", "loc" : [ 4, 1 ] }
{ "_id" : ObjectId("50747ebf44ae3dfd6b8eb419"), "name" : "Orinda", "loc" : [ 5, 1 ] }
{ "_id" : ObjectId("50747ebf44ae3dfd6b8eb41a"), "name" : "Manhattan Beach", "loc" : [ 6, 1 ] }
{ "_id" : ObjectId("50747ebf44ae3dfd6b8eb41b"), "name" : "Pleasanton", "loc" : [ 7, 1 ] }
{ "_id" : ObjectId("50747ebf44ae3dfd6b8eb41c"), "name" : "Imperial", "loc" : [ 8, 1 ] }
{ "_id" : ObjectId("50747ebf44ae3dfd6b8eb41d"), "name" : "Goleta", "loc" : [ 9, 1 ] }
has more
Program Output
Adding places...
All places added
findWithinCircle
----------------------
Emerald Lake Hills--->[ 5.0 , 5.0]
Lafayette--->[ 5.0 , 4.0]
Emerald Lake Hills--->[ 5.0 , 5.0]
Santa Ynez--->[ 4.0 , 5.0]
Santa Ynez--->[ 4.0 , 5.0]
Lafayette--->[ 5.0 , 4.0]
Danville--->[ 5.0 , 6.0]
Danville--->[ 5.0 , 6.0]
Angwin--->[ 6.0 , 5.0]
Angwin--->[ 6.0 , 5.0]
+ + + + + + + + + +
+ + + + + + + + + +
+ + + + + + + + + +
+ + + + + @ + + + +
+ + + + @ @ @ + + +
+ + + + + @ + + + +
+ + + + + + + + + +
+ + + + + + + + + +
+ + + + + + + + + +
+ + + + + + + + + +

findWithinBox
----------------------
Emerald Lake Hills--->[ 5.0 , 5.0]
Lafayette--->[ 5.0 , 4.0]
Emerald Lake Hills--->[ 5.0 , 5.0]
Lafayette--->[ 5.0 , 4.0]
Santa Ynez--->[ 4.0 , 5.0]
Santa Ynez--->[ 4.0 , 5.0]
Irvine--->[ 4.0 , 4.0]
Irvine--->[ 4.0 , 4.0]
Coronado--->[ 4.0 , 6.0]
Coronado--->[ 4.0 , 6.0]
Danville--->[ 5.0 , 6.0]
Danville--->[ 5.0 , 6.0]
Laguna Niguel--->[ 6.0 , 4.0]
Laguna Niguel--->[ 6.0 , 4.0]
Angwin--->[ 6.0 , 5.0]
Angwin--->[ 6.0 , 5.0]
Diamond Bar--->[ 6.0 , 6.0]
Diamond Bar--->[ 6.0 , 6.0]
+ + + + + + + + + +
+ + + + + + + + + +
+ + + + + + + + + +
+ + + + @ @ @ + + +
+ + + + @ @ @ + + +
+ + + + @ @ @ + + +
+ + + + + + + + + +
+ + + + + + + + + +
+ + + + + + + + + +
+ + + + + + + + + +

findCenterSphere
----------------------
Emerald Lake Hills--->[ 5.0 , 5.0]
Lafayette--->[ 5.0 , 4.0]
Emerald Lake Hills--->[ 5.0 , 5.0]
Lafayette--->[ 5.0 , 4.0]
Kensington--->[ 5.0 , 3.0]
Kensington--->[ 5.0 , 3.0]
Santa Ynez--->[ 4.0 , 5.0]
Santa Ynez--->[ 4.0 , 5.0]
San Ramon--->[ 3.0 , 5.0]
San Ramon--->[ 3.0 , 5.0]
Irvine--->[ 4.0 , 4.0]
Irvine--->[ 4.0 , 4.0]
La Palma--->[ 4.0 , 3.0]
La Palma--->[ 4.0 , 3.0]
Rossmoor--->[ 3.0 , 4.0]
Rossmoor--->[ 3.0 , 4.0]
Ross--->[ 3.0 , 3.0]
Ross--->[ 3.0 , 3.0]
Newport Beach--->[ 5.0 , 2.0]
Newport Beach--->[ 5.0 , 2.0]
La Habra Heights--->[ 4.0 , 2.0]
La Habra Heights--->[ 4.0 , 2.0]
Woodside--->[ 2.0 , 5.0]
Woodside--->[ 2.0 , 5.0]
Santa Monica--->[ 2.0 , 4.0]
Santa Monica--->[ 2.0 , 4.0]
Huntington Beach--->[ 2.0 , 6.0]
Huntington Beach--->[ 2.0 , 6.0]
Atherton--->[ 3.0 , 6.0]
Atherton--->[ 3.0 , 6.0]
Cerritos--->[ 3.0 , 7.0]
Cerritos--->[ 3.0 , 7.0]
Coronado--->[ 4.0 , 6.0]
Coronado--->[ 4.0 , 6.0]
Blackhawk-Camino Tassajara--->[ 4.0 , 7.0]
Blackhawk-Camino Tassajara--->[ 4.0 , 7.0]
Rancho Santa Margarita--->[ 4.0 , 8.0]
Rancho Santa Margarita--->[ 4.0 , 8.0]
Danville--->[ 5.0 , 6.0]
Danville--->[ 5.0 , 6.0]
Highlands-Baywood Park--->[ 5.0 , 7.0]
Highlands-Baywood Park--->[ 5.0 , 7.0]
Camarillo--->[ 5.0 , 8.0]
Camarillo--->[ 5.0 , 8.0]
Toro Canyon--->[ 6.0 , 2.0]
Toro Canyon--->[ 6.0 , 2.0]
Hermosa Beach--->[ 6.0 , 3.0]
Hermosa Beach--->[ 6.0 , 3.0]
Laguna Niguel--->[ 6.0 , 4.0]
Laguna Niguel--->[ 6.0 , 4.0]
Thousand Oaks--->[ 7.0 , 3.0]
Thousand Oaks--->[ 7.0 , 3.0]
Torrance--->[ 7.0 , 4.0]
Torrance--->[ 7.0 , 4.0]
Angwin--->[ 6.0 , 5.0]
Angwin--->[ 6.0 , 5.0]
El Segundo--->[ 7.0 , 5.0]
El Segundo--->[ 7.0 , 5.0]
Fairbanks Ranch--->[ 8.0 , 4.0]
Fairbanks Ranch--->[ 8.0 , 4.0]
Orange--->[ 8.0 , 5.0]
Orange--->[ 8.0 , 5.0]
Diamond Bar--->[ 6.0 , 6.0]
Diamond Bar--->[ 6.0 , 6.0]
Fountain Valley--->[ 6.0 , 7.0]
Fountain Valley--->[ 6.0 , 7.0]
Rancho Santa Fe--->[ 7.0 , 6.0]
Rancho Santa Fe--->[ 7.0 , 6.0]
Westlake Village--->[ 7.0 , 7.0]
Westlake Village--->[ 7.0 , 7.0]
Cypress--->[ 6.0 , 8.0]
Cypress--->[ 6.0 , 8.0]
Chino Hills--->[ 8.0 , 6.0]
Chino Hills--->[ 8.0 , 6.0]
+ + + + + + + + + +
+ + + + @ @ @ + + +
+ + + @ @ @ @ @ + +
+ + @ @ @ @ @ @ @ +
+ + @ @ @ @ @ @ @ +
+ + @ @ @ @ @ @ @ +
+ + + @ @ @ @ @ + +
+ + + + @ @ @ + + +
+ + + + + + + + + +
+ + + + + + + + + +

findNear
----------------------
Irvine--->[ 4.0 , 4.0]
Irvine--->[ 4.0 , 4.0]
Santa Ynez--->[ 4.0 , 5.0]
Santa Ynez--->[ 4.0 , 5.0]
Lafayette--->[ 5.0 , 4.0]
Lafayette--->[ 5.0 , 4.0]
La Palma--->[ 4.0 , 3.0]
La Palma--->[ 4.0 , 3.0]
Rossmoor--->[ 3.0 , 4.0]
Rossmoor--->[ 3.0 , 4.0]
Emerald Lake Hills--->[ 5.0 , 5.0]
Emerald Lake Hills--->[ 5.0 , 5.0]
San Ramon--->[ 3.0 , 5.0]
San Ramon--->[ 3.0 , 5.0]
Kensington--->[ 5.0 , 3.0]
Kensington--->[ 5.0 , 3.0]
Ross--->[ 3.0 , 3.0]
Ross--->[ 3.0 , 3.0]
Santa Monica--->[ 2.0 , 4.0]
Santa Monica--->[ 2.0 , 4.0]
La Habra Heights--->[ 4.0 , 2.0]
La Habra Heights--->[ 4.0 , 2.0]
Coronado--->[ 4.0 , 6.0]
Coronado--->[ 4.0 , 6.0]
Laguna Niguel--->[ 6.0 , 4.0]
Laguna Niguel--->[ 6.0 , 4.0]
+ + + + + + + + + +
+ + + + + + + + + +
+ + + + + + + + + +
+ + + + @ + + + + +
+ + + @ @ @ + + + +
+ + @ @ @ @ @ + + +
+ + + @ @ @ + + + +
+ + + + @ + + + + +
+ + + + + + + + + +
+ + + + + + + + + +

findNearSphere
----------------------
Emerald Lake Hills--->[ 5.0 , 5.0]
Emerald Lake Hills--->[ 5.0 , 5.0]
Santa Ynez--->[ 4.0 , 5.0]
Santa Ynez--->[ 4.0 , 5.0]
Angwin--->[ 6.0 , 5.0]
Angwin--->[ 6.0 , 5.0]
Lafayette--->[ 5.0 , 4.0]
Lafayette--->[ 5.0 , 4.0]
Danville--->[ 5.0 , 6.0]
Danville--->[ 5.0 , 6.0]
Coronado--->[ 4.0 , 6.0]
Coronado--->[ 4.0 , 6.0]
Diamond Bar--->[ 6.0 , 6.0]
Diamond Bar--->[ 6.0 , 6.0]
Irvine--->[ 4.0 , 4.0]
Irvine--->[ 4.0 , 4.0]
Laguna Niguel--->[ 6.0 , 4.0]
Laguna Niguel--->[ 6.0 , 4.0]
San Ramon--->[ 3.0 , 5.0]
San Ramon--->[ 3.0 , 5.0]
El Segundo--->[ 7.0 , 5.0]
El Segundo--->[ 7.0 , 5.0]
Kensington--->[ 5.0 , 3.0]
Kensington--->[ 5.0 , 3.0]
Highlands-Baywood Park--->[ 5.0 , 7.0]
Highlands-Baywood Park--->[ 5.0 , 7.0]
Atherton--->[ 3.0 , 6.0]
Atherton--->[ 3.0 , 6.0]
Rancho Santa Fe--->[ 7.0 , 6.0]
Rancho Santa Fe--->[ 7.0 , 6.0]
Rossmoor--->[ 3.0 , 4.0]
Rossmoor--->[ 3.0 , 4.0]
Torrance--->[ 7.0 , 4.0]
Torrance--->[ 7.0 , 4.0]
Blackhawk-Camino Tassajara--->[ 4.0 , 7.0]
Blackhawk-Camino Tassajara--->[ 4.0 , 7.0]
Fountain Valley--->[ 6.0 , 7.0]
Fountain Valley--->[ 6.0 , 7.0]
La Palma--->[ 4.0 , 3.0]
La Palma--->[ 4.0 , 3.0]
Hermosa Beach--->[ 6.0 , 3.0]
Hermosa Beach--->[ 6.0 , 3.0]
Cerritos--->[ 3.0 , 7.0]
Cerritos--->[ 3.0 , 7.0]
Westlake Village--->[ 7.0 , 7.0]
Westlake Village--->[ 7.0 , 7.0]
Ross--->[ 3.0 , 3.0]
Ross--->[ 3.0 , 3.0]
Thousand Oaks--->[ 7.0 , 3.0]
Thousand Oaks--->[ 7.0 , 3.0]
Woodside--->[ 2.0 , 5.0]
Woodside--->[ 2.0 , 5.0]
Orange--->[ 8.0 , 5.0]
Orange--->[ 8.0 , 5.0]
Newport Beach--->[ 5.0 , 2.0]
Newport Beach--->[ 5.0 , 2.0]
Camarillo--->[ 5.0 , 8.0]
Camarillo--->[ 5.0 , 8.0]
Huntington Beach--->[ 2.0 , 6.0]
Huntington Beach--->[ 2.0 , 6.0]
Chino Hills--->[ 8.0 , 6.0]
Chino Hills--->[ 8.0 , 6.0]
Santa Monica--->[ 2.0 , 4.0]
Santa Monica--->[ 2.0 , 4.0]
Fairbanks Ranch--->[ 8.0 , 4.0]
Fairbanks Ranch--->[ 8.0 , 4.0]
Rancho Santa Margarita--->[ 4.0 , 8.0]
Rancho Santa Margarita--->[ 4.0 , 8.0]
Cypress--->[ 6.0 , 8.0]
Cypress--->[ 6.0 , 8.0]
La Habra Heights--->[ 4.0 , 2.0]
La Habra Heights--->[ 4.0 , 2.0]
Toro Canyon--->[ 6.0 , 2.0]
Toro Canyon--->[ 6.0 , 2.0]
+ + + + + + + + + +
+ + + + @ @ @ + + +
+ + + @ @ @ @ @ + +
+ + @ @ @ @ @ @ @ +
+ + @ @ @ @ @ @ @ +
+ + @ @ @ @ @ @ @ +
+ + + @ @ @ @ @ + +
+ + + + @ @ @ + + +
+ + + + + + + + + +
+ + + + + + + + + +
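The planar results above can be cross-checked without a database. The sketch below is only an illustration in plain Java (the class and method names are made up, and it does not use the MongoDB driver): it rebuilds the 10x10 grid of integer points and applies the same geometric tests as the $center, $box and $near with $maxDistance queries, assuming boundary points count as inside.

```java
import java.util.ArrayList;
import java.util.List;

class PlanarGridSketch {
    /** Grid points (x, y) with 0 <= x, y < 10 that lie within the given circle (boundary included). */
    static List<int[]> withinCircle(double cx, double cy, double radius) {
        List<int[]> hits = new ArrayList<>();
        for (int y = 0; y < 10; y++) {
            for (int x = 0; x < 10; x++) {
                double dx = x - cx;
                double dy = y - cy;
                // Compare squared distances to avoid a square root.
                if (dx * dx + dy * dy <= radius * radius) {
                    hits.add(new int[] { x, y });
                }
            }
        }
        return hits;
    }

    /** Grid points inside the axis-aligned box between (x1, y1) and (x2, y2), boundary included. */
    static List<int[]> withinBox(double x1, double y1, double x2, double y2) {
        List<int[]> hits = new ArrayList<>();
        for (int y = 0; y < 10; y++) {
            for (int x = 0; x < 10; x++) {
                if (x >= x1 && x <= x2 && y >= y1 && y <= y2) {
                    hits.add(new int[] { x, y });
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println("circle (5,5) r=1: " + withinCircle(5, 5, 1).size());    // 5 points
        System.out.println("box (4,4)-(6,6):  " + withinBox(4, 4, 6, 6).size());    // 9 points
        System.out.println("near (4,4) d<=2:  " + withinCircle(4, 4, 2).size());    // 13 points
    }
}
```

The counts match the number of @ marks in the findWithinCircle, findWithinBox and findNear grids; the printed lists are twice as long because each place appears twice in the output.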
Introduction
Hadoop Map-Reduce is a system for parallel processing of large data sets. (It is YARN-based from Hadoop 2 onwards; the 1.0.2 release used in this article runs jobs through the JobTracker and TaskTrackers.) If you are new to Hadoop, first visit here. In this article, I will help you quickly get started with writing the simplest Map-Reduce job. This is the famous “Wordcount” MR job, and the first one for 90% of people (if not more).
WordCount is a simple application that counts the number of occurrences of each word in a given input set.
This code example is from the MapReduce tutorial available here. You can check out the source code directly from this small GitHub project I created.
Step 1. Install and start Hadoop server
In this tutorial, I assume your Hadoop installation is ready. For a single-node setup, visit here.
Start Hadoop:
amresh@ubuntu:/home/amresh$ cd /usr/local/hadoop/
amresh@ubuntu:/usr/local/hadoop-1.0.2$ bin/start-all.sh
amresh@ubuntu:/usr/local/hadoop-1.0.2$ sudo jps
6098 JobTracker
8024 Jps
5783 DataNode
5997 SecondaryNameNode
5571 NameNode
6310 TaskTracker
(Make sure NameNode, DataNode, JobTracker, TaskTracker, SecondaryNameNode are running)
If the NameNode is not running, try formatting it (this wipes existing HDFS metadata) and restarting Hadoop.
amresh@ubuntu:/usr/local/hadoop-1.0.2$ bin/hadoop namenode -format
amresh@ubuntu:/usr/local/hadoop-1.0.2$ bin/stop-all.sh
amresh@ubuntu:/usr/local/hadoop-1.0.2$ bin/start-all.sh
Step 2. Write Map-Reduce Job for Wordcount
Map.java (Mapper Implementation)
package com.impetus.code.examples.hadoop.mapred.wordcount;
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;
public class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable>
{
private final static IntWritable one = new IntWritable(1);
private Text word = new Text();
public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter)
throws IOException
{
String line = value.toString();
StringTokenizer tokenizer = new StringTokenizer(line);
while (tokenizer.hasMoreTokens())
{
word.set(tokenizer.nextToken());
output.collect(word, one);
}
}
}
Reduce.java (Reducer Implementation)
package com.impetus.code.examples.hadoop.mapred.wordcount;
import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
public class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable>
{
public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output,
Reporter reporter) throws IOException
{
int sum = 0;
while (values.hasNext())
{
sum += values.next().get();
}
output.collect(key, new IntWritable(sum));
}
}
WordCount.java (Job)
package com.impetus.code.examples.hadoop.mapred.wordcount;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;
public class WordCount
{
public static void main(String[] args) throws Exception
{
JobConf conf = new JobConf(WordCount.class);
conf.setJobName("wordcount");
conf.setOutputKeyClass(Text.class);
conf.setOutputValueClass(IntWritable.class);
conf.setMapperClass(Map.class);
conf.setCombinerClass(Reduce.class);
conf.setReducerClass(Reduce.class);
conf.setInputFormat(TextInputFormat.class);
conf.setOutputFormat(TextOutputFormat.class);
FileInputFormat.setInputPaths(conf, new Path(args[0]));
FileOutputFormat.setOutputPath(conf, new Path(args[1]));
JobClient.runJob(conf);
}
}
Step 3. Compile and Create Jar file
I prefer Maven for building my Java projects. You can find the POM file here and add it to your Java project. This makes sure the Hadoop jar dependency is available.
Just run:
amresh@ubuntu:/usr/local/hadoop-1.0.2$ cd ~/development/hadoop-examples
amresh@ubuntu:/home/amresh/development/hadoop-examples$ mvn clean install
Step 4. Create input files to copy words from
amresh@ubuntu:/usr/local/hadoop-1.0.2$ bin/hadoop dfs -mkdir ~/wordcount/input
amresh@ubuntu:/usr/local/hadoop-1.0.2$ sudo vi file01 (Hello World Bye World)
amresh@ubuntu:/usr/local/hadoop-1.0.2$ sudo vi file02 (Hello Hadoop Goodbye Hadoop)
amresh@ubuntu:/usr/local/hadoop-1.0.2$ bin/hadoop dfs -copyFromLocal file01 /home/amresh/wordcount/input/
amresh@ubuntu:/usr/local/hadoop-1.0.2$ bin/hadoop dfs -copyFromLocal file02 /home/amresh/wordcount/input/
amresh@ubuntu:/usr/local/hadoop-1.0.2$ bin/hadoop dfs -ls /home/amresh/wordcount/input/
Found 2 items
-rw-r--r-- 1 amresh supergroup 0 2012-05-08 14:51 /home/amresh/wordcount/input/file01
-rw-r--r-- 1 amresh supergroup 0 2012-05-08 14:51 /home/amresh/wordcount/input/file02
Step 5. Run Map-Reduce job you wrote
amresh@ubuntu:/usr/local/hadoop-1.0.2$ bin/hadoop jar ~/development/hadoop-examples/target/hadoop-examples-1.0.jar com.impetus.code.examples.hadoop.mapred.wordcount.WordCount /home/amresh/wordcount/input /home/amresh/wordcount/output
amresh@ubuntu:/usr/local/hadoop-1.0.2$ bin/hadoop dfs -ls /home/amresh/wordcount/output/
Found 3 items
-rw-r--r-- 1 amresh supergroup 0 2012-05-08 15:23 /home/amresh/wordcount/output/_SUCCESS
drwxr-xr-x - amresh supergroup 0 2012-05-08 15:22 /home/amresh/wordcount/output/_logs
-rw-r--r-- 1 amresh supergroup 41 2012-05-08 15:23 /home/amresh/wordcount/output/part-00000
amresh@ubuntu:/usr/local/hadoop-1.0.2$ bin/hadoop dfs -cat /home/amresh/wordcount/output/part-00000
Bye 1
Goodbye 1
Hadoop 2
Hello 2
World 2
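To see why the job produces this output, it can help to run the same logic in memory, with no cluster involved. The sketch below is only an illustration: it does not use the Hadoop API, the class and method names are made up, and a sorted map stands in for the shuffle-and-sort step that groups values by key before the reducer runs.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

class LocalWordCountSketch {
    /** Map phase: emit a (word, 1) pair for every whitespace-separated token in a line. */
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            pairs.add(new SimpleEntry<>(tokenizer.nextToken(), 1));
        }
        return pairs;
    }

    /** Shuffle and reduce phases: group pairs by word (sorted by key) and sum the counts. */
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return totals;
    }

    /** Runs the map step on every line, then reduces all emitted pairs together. */
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> allPairs = new ArrayList<>();
        for (String line : lines) {
            allPairs.addAll(map(line));
        }
        return reduce(allPairs);
    }

    public static void main(String[] args) {
        // Same contents as file01 and file02 from Step 4.
        List<String> lines = Arrays.asList("Hello World Bye World", "Hello Hadoop Goodbye Hadoop");
        for (Map.Entry<String, Integer> entry : wordCount(lines).entrySet()) {
            System.out.println(entry.getKey() + " " + entry.getValue());
        }
    }
}
```

Running it prints the same five word counts as part-00000, in the same key order.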
When it comes to building a rich-UI web application in Java, I always count on JSF. I know there are many who hate JSF, but this article is not about getting into those fights.
As a beginner, I always spent hours resolving jar dependencies and configuration issues. Many times they are tough to crack, as error messages are not very helpful. The problem is aggravated when you use popular component libraries like Richfaces and Primefaces, as you are often not sure where that annoying error message is coming from.
This article doesn’t help you learn JSF. I assume you have already read a few Hello World style JSF examples somewhere on the internet and know the basics. My intent is to help you make a simple JSF web page quickly and easily, without worrying about those jar dependencies.
I am fond of both the Richfaces and Primefaces component libraries, and hence have included both of them in my example. You can use both, either one, or neither…your pick.
Prerequisite
We are using Maven to specify jar dependencies and build our project. Eclipse will be used as the IDE and Tomcat 6 as the web container.
We’ll be using JSF version 2.1.10.
Build Maven Project
Download and configure Maven on your machine, if not done already.
1. Create a Maven project
amresh@ubuntu:~/development$ mvn archetype:generate -DgroupId=com.xamry.jsfapp -DartifactId=jsf-webapp -DarchetypeArtifactId=maven-archetype-webapp
amresh@ubuntu:~/development$ cd jsf-webapp/
amresh@ubuntu:~/development/jsf-webapp$ ls -ltr
total 8
-rw-rw-r-- 1 amresh amresh 712 Aug 28 22:46 pom.xml
drwxrwxr-x 3 amresh amresh 4096 Aug 28 22:46 src
amresh@ubuntu:~/development/jsf-webapp$ cd src/main/
amresh@ubuntu:~/development/jsf-webapp/src/main$ ls
resources webapp
amresh@ubuntu:~/development/jsf-webapp/src/main$ mkdir java
amresh@ubuntu:~/development/jsf-webapp/src/main$ cd ../..
2. Generate Eclipse settings
amresh@ubuntu:~/development/jsf-webapp$ mvn eclipse:clean eclipse:eclipse
3. Import Project into Eclipse
Eclipse -> File -> Import -> Existing Projects into Workspace -> (Select root directory and then jsf-webapp) -> OK
4. Edit pom.xml
Copy and paste the code below into pom.xml under your project root folder:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.xamry.jsfapp</groupId>
<artifactId>jsf-webapp</artifactId>
<packaging>war</packaging>
<version>1.0-SNAPSHOT</version>
<name>JSF Web Application</name>
<url>http://maven.apache.org</url>
<properties>
<org.richfaces.bom.version>4.1.0.Final</org.richfaces.bom.version>
</properties>
<repositories>
<!-- Primeface repository -->
<repository>
<id>prime-repo</id>
<name>PrimeFaces Maven Repository</name>
<url>http://repository.primefaces.org</url>
<layout>default</layout>
</repository>
</repositories>
<dependencyManagement>
<dependencies>
<dependency>
<groupId>org.richfaces</groupId>
<artifactId>richfaces-bom</artifactId>
<version>${org.richfaces.bom.version}</version>
<scope>import</scope>
<type>pom</type>
</dependency>
</dependencies>
</dependencyManagement>
<dependencies>
<!-- JSF Dependencies -->
<dependency>
<groupId>com.sun.faces</groupId>
<artifactId>jsf-api</artifactId>
<version>2.1.10</version>
</dependency>
<dependency>
<groupId>com.sun.faces</groupId>
<artifactId>jsf-impl</artifactId>
<version>2.1.10</version>
</dependency>
<dependency>
<groupId>jstl</groupId>
<artifactId>jstl</artifactId>
<version>1.2</version>
</dependency>
<!-- Richfaces dependencies -->
<dependency>
<groupId>org.richfaces.ui</groupId>
<artifactId>richfaces-components-ui</artifactId>
</dependency>
<dependency>
<groupId>org.richfaces.core</groupId>
<artifactId>richfaces-core-impl</artifactId>
</dependency>
<!-- Primefaces dependencies -->
<dependency>
<groupId>org.primefaces</groupId>
<artifactId>primefaces</artifactId>
<version>3.3.1</version>
</dependency>
<!-- Common Dependencies -->
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>3.8.1</version>
<scope>test</scope>
</dependency>
</dependencies>
<build>
<finalName>jsf-webapp</finalName>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>2.5.1</version>
<configuration>
<source>1.6</source>
<target>1.6</target>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-war-plugin</artifactId>
<configuration>
<archive>
<manifestEntries>
<Dependencies>org.slf4j</Dependencies>
</manifestEntries>
</archive>
</configuration>
</plugin>
</plugins>
</build>
</project>
Configuration Files
WEB-INF/web.xml
<?xml version="1.0" encoding="UTF-8"?>
<web-app xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xmlns="http://java.sun.com/xml/ns/javaee"
  xmlns:web="http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd"
  xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd"
  id="WebApp_ID" version="2.5">
  <display-name>JSF Webapp</display-name>
  <context-param>
    <param-name>javax.faces.STATE_SAVING_METHOD</param-name>
    <param-value>server</param-value>
  </context-param>
  <context-param>
    <param-name>javax.faces.CONFIG_FILES</param-name>
    <param-value>/WEB-INF/faces-config.xml</param-value>
  </context-param>
  <context-param>
    <param-name>org.richfaces.SKIN</param-name>
    <param-value>blueSky</param-value>
  </context-param>
  <context-param>
    <param-name>org.richfaces.CONTROL_SKINNING</param-name>
    <param-value>enable</param-value>
  </context-param>
  <listener>
    <listener-class>com.sun.faces.config.ConfigureListener</listener-class>
  </listener>
  <!-- Faces Servlet -->
  <servlet>
    <servlet-name>Faces Servlet</servlet-name>
    <servlet-class>javax.faces.webapp.FacesServlet</servlet-class>
    <load-on-startup>1</load-on-startup>
  </servlet>
  <servlet-mapping>
    <servlet-name>Faces Servlet</servlet-name>
    <url-pattern>*.jsf</url-pattern>
  </servlet-mapping>
</web-app>
WEB-INF/faces-config.xml
<?xml version="1.0" encoding="UTF-8"?>
<faces-config xmlns="http://java.sun.com/xml/ns/javaee"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-facesconfig_2_1.xsd"
  version="2.1">
</faces-config>
XHTMLs
I know I am jumping ahead a bit, but I prefer creating JSF pages with templates. I am going to create two XHTML pages, one showing a Richfaces panel and the other showing a Primefaces panel. You are free to use the components of your choice. Each page will have a template file and a content file. Here it goes:
Richfaces XHTML Example:
xhtml/richfacesPageTemplate.xhtml
<!DOCTYPE html>
<html lang="en" xmlns="http://www.w3.org/1999/xhtml"
  xmlns:f="http://java.sun.com/jsf/core"
  xmlns:h="http://java.sun.com/jsf/html"
  xmlns:ui="http://java.sun.com/jsf/facelets">
<h:head>
  <title><ui:insert name="title">Default title</ui:insert></title>
</h:head>
<h:body>
  <table width="100%">
    <tr>
      <td align="center"><ui:insert name="content">Default content</ui:insert></td>
    </tr>
  </table>
</h:body>
</html>
xhtml/richfacesPage.xhtml
<ui:composition xmlns="http://www.w3.org/1999/xhtml"
  template="richfacesPageTemplate.xhtml"
  xmlns:h="http://java.sun.com/jsf/html"
  xmlns:f="http://java.sun.com/jsf/core"
  xmlns:ui="http://java.sun.com/jsf/facelets"
  xmlns:a4j="http://richfaces.org/a4j"
  xmlns:rich="http://richfaces.org/rich">
  <ui:define name="title">Richfaces - Page Example</ui:define>
  <ui:define name="content">
    <table>
      <tr>
        <td><rich:messages styleClass="errorMessage" /></td>
      </tr>
    </table>
    <br />
    <rich:panel header="Richfaces - Panel">
      Hi, This is an example richfaces panel
    </rich:panel>
  </ui:define>
</ui:composition>
Primefaces XHTML Example:
xhtml/primefacesPageTemplate.xhtml
<!DOCTYPE html>
<html lang="en" xmlns="http://www.w3.org/1999/xhtml"
  xmlns:f="http://java.sun.com/jsf/core"
  xmlns:h="http://java.sun.com/jsf/html"
  xmlns:ui="http://java.sun.com/jsf/facelets"
  xmlns:p="http://primefaces.org/ui">
<f:view contentType="text/html">
  <h:head>
    <title><ui:insert name="title">Default title</ui:insert></title>
  </h:head>
  <h:body>
    <table width="100%">
      <tr>
        <td align="center"><ui:insert name="content">Default content</ui:insert></td>
      </tr>
    </table>
  </h:body>
</f:view>
</html>
xhtml/primefacesPage.xhtml
<ui:composition xmlns="http://www.w3.org/1999/xhtml"
  template="primefacesPageTemplate.xhtml"
  xmlns:h="http://java.sun.com/jsf/html"
  xmlns:f="http://java.sun.com/jsf/core"
  xmlns:ui="http://java.sun.com/jsf/facelets"
  xmlns:p="http://primefaces.org/ui">
  <ui:define name="title">Primefaces - Page Example</ui:define>
  <ui:define name="content">
    <table>
      <tr>
        <td><p:growl id="growl" sticky="false" showDetail="false" /></td>
      </tr>
    </table>
    <br />
    <p:panel header="Primefaces - Panel">
      This is a Primefaces panel
    </p:panel>
  </ui:define>
</ui:composition>
Test your work
At the end, your folder structure should look like this in Eclipse:
1. Create War file and copy to Tomcat
amresh@ubuntu:~/development/jsf-webapp$ mvn clean install
amresh@ubuntu:~/development/jsf-webapp$ cp target/jsf-webapp.war /usr/local/apache-tomcat-6.0.32/webapps/
2. Start Tomcat
amresh@ubuntu:~/development/jsf-webapp$ cd /usr/local/apache-tomcat-6.0.32/bin
amresh@ubuntu:/usr/local/apache-tomcat-6.0.32/bin$ ./catalina.sh run
3. Hit web-pages
http://localhost:8080/jsf-webapp/xhtml/primefacesPage.jsf
http://localhost:8080/jsf-webapp/xhtml/richfacesPage.jsf
They should look similar to below screenshots:
Introduction
If you reached this page, it’s fair to assume that you must have worked on at least one relational database in your lifetime. They have been in use for a quarter of a century and are found in almost all business applications.

But NoSQL databases are gaining traction these days. They are often called “Not only SQL” databases; it’s an umbrella term for a loosely defined class of non-relational data-stores.
They exhibit the following main characteristics:
- They don’t use SQL as their query language.
- They may not give full ACID guarantees.
- They have distributed, fault-tolerant architecture.
In this article, I am going to explore whether ORM tools (whatever they are) make sense in the NoSQL world, and whether they will be able to solve problems that are NoSQL-specific. Next we’ll delve into the approaches and challenges in making such a tool. This article assumes you are already familiar with, and have worked on, at least one NoSQL database.
ORM Solutions
ORM (Object Relational Mapping) solutions came into existence to solve the object-relational impedance mismatch problem. The most popular among them are Hibernate, TopLink and EclipseLink. They work beautifully with relational databases like Oracle and MySQL, among others.
Each ORM solution had its own API and object query language (like HQL for Hibernate), which made it difficult for programmers to switch from one framework to another. As a result, efforts were made to create standards and specifications. The most popular ORM standards are:
- EJB – Enterprise Java Beans (Entity Beans to be specific)
- JPA – Java Persistence API
- JDO – Java Data Objects
- SDO – Service Data Objects
Problems in working with NoSQL
The problem with NoSQL databases is that there is NOT EVEN ONE existing industry standard (like SQL) for them. The very idea of being “something opposed to SQL”, and the resulting deviation from standards and rules, is going to be suicidal if not corrected at the right time. As a result, learning to work with a new NoSQL database is always cumbersome.
Apart from that, people lack in-depth knowledge of NoSQL, and those who have it are usually confined to one or two databases. In the relational world, people depend upon their knowledge of SQL and JDBC for basic and intermediate database work, so switching to another database requires little or no effort. The same switch is painful in the NoSQL world.
ORM for NoSQL?
“ORM for NoSQL” is a somewhat misleading term. People prefer to call it an “OM tool for NoSQL” or maybe “ODM” (Object Data-store Mapping tool). ORM frameworks have been around for decades and are a de facto industry standard. People are very clear about what ORM tools are supposed to do. There are no surprises.
The key here is to let people stop worrying about the complexities inherent in NoSQL databases. Let them do things in a way they already know and are comfortable with. Why not use an approach that has served this problem domain for decades and has proven its usefulness?
A good use case for ORM tools is the migration of applications (built using an ORM tool) from an RDBMS to a NoSQL database, or even from one NoSQL database to another. This requires (at least in theory) little or no programming effort in the business domain.
Challenges in Making ORM Solution
Here are some real challenges that ORM solution providers are going to face:
- Making one solution for many common-looking problems requires some really tough generalizations and abstractions. Design is going to be a challenge, as abstractions sometimes take away (or at least make it difficult to expose) many of the powers of NoSQL.
- NoSQL databases are built for performance and scalability. ORM tools have to employ techniques that don’t add too much overhead compared with the performance users would get from the plain vanilla driver that ships with the database.
- Many ORM standards (like JPA and JDO) were written for relational databases and don’t always provide ways of doing certain things in NoSQL. In fact, most ORM tools (like Hibernate) were built to solve only 80% of the frequently encountered mapping problems. Low-level tweaking may prove crucial for applications built on NoSQL databases. The challenge here is to provide developers with hooks for solving the remaining 20% of the mapping problem.
- Each NoSQL database has its own architecture and network topology. Each makes specific assumptions about data centers, racks and disk connectivity, and about how data is stored and distributed across them. In simple terms, NoSQL databases are “born distributed”. It’s a real challenge to express all this configuration through ORM specifications.
Approach
There are some guidelines worth sharing that help in developing ORM tools for NoSQL:
- Map notations in the ORM framework to the NoSQL structures that users are most likely to associate them with. An example is mapping @Embedded in JPA to a super column in Cassandra.
- If the framework provides a way to operate on the database directly, leverage it. An example is mapping native queries in JPA to CQL in Cassandra.
- NoSQL is built for performance. Noticeable overhead over plain drivers is going to be a big turn-off for users. Make special efforts to keep the overhead of the ORM to a minimum, even if that requires sleepless nights and weeks of design-discussion fights.
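To make the first guideline a little more concrete, here is a minimal plain-Java sketch of what “embedded object to super column” could look like as a data structure. It is only an illustration under my own assumptions: the class `EmbeddedMappingSketch`, its `PersonalData` stand-in and the column names are hypothetical, and this is not how any particular library implements the mapping.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: models a super column as a nested map.
final class EmbeddedMappingSketch {

    // Stand-in for an @Embeddable object (annotations omitted on purpose).
    static final class PersonalData {
        final String website;
        final String email;

        PersonalData(String website, String email) {
            this.website = website;
            this.email = email;
        }
    }

    // Turns one embedded field into: super column name -> (sub-column name -> value).
    static Map<String, Map<String, String>> toSuperColumn(String fieldName, PersonalData data) {
        Map<String, String> subColumns = new LinkedHashMap<>();
        subColumns.put("p_website", data.website);
        subColumns.put("p_email", data.email);

        Map<String, Map<String, String>> superColumn = new LinkedHashMap<>();
        superColumn.put(fieldName, subColumns);
        return superColumn;
    }

    public static void main(String[] args) {
        PersonalData data = new PersonalData("www.johnsmith.com", "john.smith@gmail.com");
        System.out.println(toSuperColumn("personalData", data));
    }
}
```

The point of the sketch is only that the embedded object keeps its grouping in the store (one named container holding its fields) instead of being scattered across unrelated columns.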
Other Benefits
There are many other benefits of using a standard ORM tool over plain low-level driver library:
- Ease of use, faster development, increased productivity.
- NoSQL databases don’t provide a mechanism to maintain relationships between tables. In real life, though, business objects or entities do have relationships among them. ORM solutions may allow you to define these relationships in business objects and handle their storage and retrieval behind the scenes.
- ORM solutions (like Kundera and SpringData) may provide polyglot persistence transparently, which would otherwise be impossible. This may prove to be a boon for complex applications requiring storage in multiple databases.
- Most NoSQL databases lack transaction management capabilities, as they were not built that way (and because they were built to solve problems that didn’t require it in the first place). Many ORM specifications (like JPA) mandate this capability. If your application requires the likes of atomicity, they are going to come to your rescue.
As with everything in life, the discipline and rules that you promise yourself pay off in the long run. Those NoSQL databases that provide the best balance of features and ease of use are going to be successful eventually. ORM tools could be one facilitator in that pursuit.
Introduction
A composite key consists of more than one primary key field. Each field must be of a data type supported by the underlying data-store.
In JPA (Java Persistence API), there are two ways of specifying composite keys:
1. Composite Primary Key:
@Entity
@IdClass(TimelineId.class)
public class Timeline {
@Id int userId;
@Id long tweetId;
//Other non-primary key fields
}
class TimelineId implements java.io.Serializable { //key classes must be Serializable
int userId;
long tweetId;
}
2. Embedded Primary Key:
@Entity
public class Timeline {
@EmbeddedId TimelineId id;
//Other non-primary key fields
}
@Embeddable
class TimelineId implements java.io.Serializable { //key classes must be Serializable
int userId;
long tweetId;
}
The Timeline entity above is inspired by the famous twissandra example. Cassandra supports composite keys starting with the 1.1 release.
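Whichever of the two forms you pick, JPA expects the key class to be Serializable and to define equals() and hashCode() over the key fields, since the provider compares and caches entities by their identity. Here is a minimal sketch of such a key class. It deliberately leaves out the JPA annotations so that it compiles on its own, and the name `TimelineIdSketch` is mine, chosen for illustration only.

```java
import java.io.Serializable;
import java.util.Objects;

// Plain-Java sketch of a composite key class. JPA annotations are left out
// on purpose so the snippet compiles without a persistence provider.
final class TimelineIdSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    private int userId;
    private long tweetId;

    // No-arg constructor, since providers typically create key objects reflectively.
    TimelineIdSketch() {
    }

    TimelineIdSketch(int userId, long tweetId) {
        this.userId = userId;
        this.tweetId = tweetId;
    }

    // Two keys are equal when both key fields match.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof TimelineIdSketch)) {
            return false;
        }
        TimelineIdSketch that = (TimelineIdSketch) other;
        return userId == that.userId && tweetId == that.tweetId;
    }

    // Must be consistent with equals(): same fields, same hash.
    @Override
    public int hashCode() {
        return Objects.hash(userId, tweetId);
    }

    public static void main(String[] args) {
        TimelineIdSketch first = new TimelineIdSketch(1, 42L);
        TimelineIdSketch second = new TimelineIdSketch(1, 42L);
        System.out.println(first.equals(second)); // prints "true"
    }
}
```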
Cassandra Composite Keys in Action
Visit this page in order to understand Cassandra Schema in general. In this section I will give you a feel of how composite keys are stored in Cassandra.
Let’s start a Cassandra 1.1.x server and run the following commands from the Cassandra bin directory:
CQL:
./cqlsh -3 localhost 9160
CREATE KEYSPACE twissandra with strategy_class = 'SimpleStrategy' and strategy_options:replication_factor=1;
use twissandra;
CREATE TABLE timeline(
user_id varchar,
tweet_id varchar,
tweet_device varchar,
author varchar,
body varchar,
PRIMARY KEY(user_id,tweet_id,tweet_device));
INSERT INTO timeline (user_id, tweet_id, tweet_device, author, body) VALUES ('xamry', 't1', 'web', 'Amresh', 'Here is my first tweet');
INSERT INTO timeline (user_id, tweet_id, tweet_device, author, body) VALUES ('xamry', 't2', 'sms', 'Saurabh', 'Howz life Xamry');
INSERT INTO timeline (user_id, tweet_id, tweet_device, author, body) VALUES ('mevivs', 't1', 'iPad', 'Kuldeep', 'You der?');
INSERT INTO timeline (user_id, tweet_id, tweet_device, author, body) VALUES ('mevivs', 't2', 'mobile', 'Vivek', 'Yep, I suppose');
cqlsh:twissandra> select * from timeline;
user_id | tweet_id | tweet_device | author | body
---------+----------+--------------+---------+------------------------
xamry | t1 | web | Amresh | Here is my first tweet
xamry | t2 | sms | Saurabh | Howz life Xamry
mevivs | t1 | iPad | Kuldeep | You der?
mevivs | t2 | mobile | Vivek | Yep, I suppose
cqlsh:twissandra> SELECT * FROM timeline WHERE user_id='xamry';
user_id | tweet_id | tweet_device | author | body
---------+----------+--------------+---------+------------------------
xamry | t1 | web | Amresh | Here is my first tweet
xamry | t2 | sms | Saurabh | Howz life Xamry
cqlsh:twissandra> select * from timeline where tweet_id = 't1';
user_id | tweet_id | tweet_device | author | body
---------+----------+--------------+---------+------------------------
xamry | t1 | web | Amresh | Here is my first tweet
mevivs | t1 | iPad | Kuldeep | You der?
cqlsh:twissandra> select * from timeline where user_id = 'xamry' and tweet_id='t1';
user_id | tweet_id | tweet_device | author | body
---------+----------+--------------+--------+------------------------
xamry | t1 | web | Amresh | Here is my first tweet
cqlsh:twissandra> select * from timeline where user_id = 'xamry' and author='Amresh';
Bad Request: No indexed columns present in by-columns clause with Equal operator
cqlsh:twissandra> select * from timeline where user_id = 'xamry' and tweet_device='web';
Bad Request: PRIMARY KEY part tweet_device cannot be restricted (preceding part tweet_id is either not restricted or by a non-EQ relation)
cqlsh:twissandra> select * from timeline where user_id = 'xamry' and tweet_id = 't1' and tweet_device='web';
user_id | tweet_id | tweet_device | author | body
---------+----------+--------------+--------+------------------------
xamry | t1 | web | Amresh | Here is my first tweet
Cassandra-cli:
impadmin@impetus-ubuntu:/usr/local/apache-cassandra-1.1.2/bin$ ./cassandra-cli -h localhost -p 9160
Connected to: "Test Cluster" on localhost/9160
Welcome to Cassandra CLI version 1.1.2
Type 'help;' or '?' for help.
Type 'quit;' or 'exit;' to quit.
[default@unknown] use twissandra;
Authenticated to keyspace: twissandra
[default@twissandra] list timeline;
Using default limit of 100
Using default column limit of 100
-------------------
RowKey: xamry
=> (column=t1:web:author, value=Amresh, timestamp=1343729388951000)
=> (column=t1:web:body, value=Here is my first tweet, timestamp=1343729388951001)
=> (column=t2:sms:author, value=Saurabh, timestamp=1343729388973000)
=> (column=t2:sms:body, value=Howz life Xamry, timestamp=1343729388973001)
-------------------
RowKey: mevivs
=> (column=t1:iPad:author, value=Kuldeep, timestamp=1343729388991000)
=> (column=t1:iPad:body, value=You der?, timestamp=1343729388991001)
=> (column=t2:mobile:author, value=Vivek, timestamp=1343729389941000)
=> (column=t2:mobile:body, value=Yep, I suppose, timestamp=1343729389941001)
Observations
1. The first part of the composite key (user_id) is called the “Partition Key”; the rest (tweet_id, tweet_device) are the remaining keys.
2. Cassandra stores columns differently when composite keys are used. The partition key becomes the row key. The values of the remaining keys are concatenated with each column name (with “:” as separator) to form the stored column names. Column values remain unchanged.
3. The remaining keys (other than the partition key) are ordered, and you cannot search on an arbitrary one of them: you have to restrict the first one before you can restrict the second, and so on. This is evident from the second “Bad Request” error (tweet_device restricted without tweet_id). The first “Bad Request” error shows that a non-key column such as author cannot be used in a WHERE clause unless it is indexed.
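The observations above can be mimicked with a small in-memory model. The sketch below is only an illustration of the layout seen in the cassandra-cli listing (row key plus sorted composite column names); it is not how Cassandra is implemented, and the class and method names are made up for this example. It also shows why restricting a prefix of the remaining keys is cheap: matching columns sit next to each other in sorted order.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative model only; Cassandra's real storage engine is far more involved.
final class CompositeKeyLayoutSketch {

    // row key (partition key) -> sorted columns (composite column name -> value)
    private final Map<String, TreeMap<String, String>> rows = new TreeMap<>();

    // Stores one cell: the remaining key values are prefixed to the column name.
    void insert(String userId, String tweetId, String tweetDevice, String column, String value) {
        String columnName = tweetId + ":" + tweetDevice + ":" + column;
        rows.computeIfAbsent(userId, key -> new TreeMap<>()).put(columnName, value);
    }

    // Returns the columns of one row whose name starts with "<tweetId>:".
    // Because names are sorted, this is a contiguous range scan.
    Map<String, String> findByPrefix(String userId, String tweetId) {
        TreeMap<String, String> columns = rows.getOrDefault(userId, new TreeMap<>());
        String from = tweetId + ":"; // inclusive lower bound
        String to = tweetId + ";";   // exclusive upper bound: ';' follows ':' in character order
        return columns.subMap(from, to);
    }

    public static void main(String[] args) {
        CompositeKeyLayoutSketch timeline = new CompositeKeyLayoutSketch();
        timeline.insert("xamry", "t1", "web", "author", "Amresh");
        timeline.insert("xamry", "t1", "web", "body", "Here is my first tweet");
        timeline.insert("xamry", "t2", "sms", "author", "Saurabh");
        timeline.insert("xamry", "t2", "sms", "body", "Howz life Xamry");

        // prints {t1:web:author=Amresh, t1:web:body=Here is my first tweet}
        System.out.println(timeline.findByPrefix("xamry", "t1"));
    }
}
```

Looking up by tweet_device alone has no such contiguous range in this model, which is the intuition behind the “preceding part tweet_id is either not restricted” error.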
Introduction
Kundera is a powerful JPA based object-datastore mapping library (ORM equivalent) for NoSQL databases. It significantly reduces the programming effort required for performing CRUD operations on NoSQL databases. Kundera currently supports Cassandra, HBase, MongoDB and relational databases.
Cross-datastore persistence is the latest feather in its cap. If your business objects are distributed across multiple databases, all you have to do is create entity classes, define their relationships and specify which database each of them should be stored in. You perform CRUD operations on them using the JPA API, and the rest is taken care of by Kundera. It automatically stores entities into, and searches them from, their intended datastores.
Kundera doesn’t support JoinByPrimaryKeyColumn as of now. It’s a proposed feature that we are planning to take up in upcoming releases.
If you want to quickly get started with Kundera, visit this link on Kundera wiki. You can download latest release of Kundera from here.
In this post, I will take you through this exciting journey. I am taking a simple example of two entities, namely PERSON and ADDRESS, to be stored in MySQL and Cassandra respectively. You can, however, choose any combination (Cassandra + HBase, MongoDB + MySQL, HBase + MongoDB etc.).
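Before getting into real configuration, it may help to picture cross-datastore persistence as a simple routing problem: each entity type is bound to one persistence unit, and every operation is dispatched to the store behind that unit. The sketch below illustrates only that idea with in-memory maps. It is a hypothetical model (the class `DatastoreRouterSketch` and its methods are invented for this post) and says nothing about how Kundera works internally.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of cross-datastore routing with in-memory stand-ins.
final class DatastoreRouterSketch {

    // entity name -> persistence unit name
    private final Map<String, String> unitByEntity = new HashMap<>();
    // persistence unit name -> stand-in store (primary key -> entity)
    private final Map<String, Map<String, Object>> stores = new LinkedHashMap<>();

    // Binds an entity type to the persistence unit that should hold it.
    void register(String entityName, String persistenceUnit) {
        unitByEntity.put(entityName, persistenceUnit);
        stores.computeIfAbsent(persistenceUnit, key -> new HashMap<>());
    }

    // Saves the entity into the store of its persistence unit and returns that unit's name.
    String persist(String entityName, String id, Object entity) {
        String unit = unitByEntity.get(entityName);
        if (unit == null) {
            throw new IllegalArgumentException("No persistence unit registered for " + entityName);
        }
        stores.get(unit).put(id, entity);
        return unit;
    }

    // Looks the entity up in the store it was routed to; null when unknown.
    Object find(String entityName, String id) {
        String unit = unitByEntity.get(entityName);
        return unit == null ? null : stores.get(unit).get(id);
    }

    public static void main(String[] args) {
        DatastoreRouterSketch router = new DatastoreRouterSketch();
        router.register("Person", "mysqlPU");
        router.register("Address", "CassandraPU");

        System.out.println(router.persist("Person", "1", "John Smith"));       // prints "mysqlPU"
        System.out.println(router.persist("Address", "111", "123, New street")); // prints "CassandraPU"
    }
}
```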
Kundera supports both Unidirectional and Bidirectional associations. I’ll take following associations in this post:
- Unidirectional (OneToOne, OneToMany, ManyToOne, ManyToMany)
- Bidirectional (OneToOne, OneToMany, ManyToOne, ManyToMany)
Configuration
persistence.xml
<persistence xmlns="http://java.sun.com/xml/ns/persistence" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://java.sun.com/xml/ns/persistence http://java.sun.com/xml/ns/persistence/persistence_2_0.xsd" version="2.0">
<persistence-unit name="mysqlPU">
<provider>com.impetus.kundera.KunderaPersistence</provider>
<class>com.impetus.kundera.examples.entities.Person</class>
<class>com.impetus.kundera.examples.entities.Address</class>
<properties>
<property name="kundera.client.lookup.class" value="com.impetus.client.rdbms.RDBMSClientFactory" />
<property name="hibernate.show_sql" value="true" />
<property name="hibernate.format_sql" value="true" />
<property name="hibernate.dialect" value="org.hibernate.dialect.MySQL5Dialect" />
<property name="hibernate.connection.driver_class" value="com.mysql.jdbc.Driver" />
<property name="hibernate.connection.url" value="jdbc:mysql://localhost:3306/hibernatepoc" />
<property name="hibernate.connection.username" value="root" />
<property name="hibernate.connection.password" value="impetus" />
<property name="hibernate.current_session_context_class" value="org.hibernate.context.ThreadLocalSessionContext" />
</properties>
</persistence-unit>
<persistence-unit name="CassandraPU">
<provider>com.impetus.kundera.KunderaPersistence</provider>
<properties>
<property name="kundera.nodes" value="localhost" />
<property name="kundera.port" value="9160" />
<property name="kundera.keyspace" value="KunderaKeyspace" />
<property name="kundera.dialect" value="cassandra" />
<property name="kundera.client.lookup.class" value="com.impetus.client.cassandra.pelops.PelopsClientFactory" />
<property name="kundera.cache.provider.class" value="com.impetus.kundera.cache.ehcache.EhCacheProvider" />
<property name="kundera.cache.config.resource" value="/ehcache-test.xml" />
</properties>
</persistence-unit>
</persistence>
Unidirectional Association
OneToOne Relationship
Database Table Structure:
PERSON (PERSON_ID, PERSON_NAME, p_website, p_email, p_yahoo_id, ADDRESS_ID)
ADDRESS(ADDRESS_ID, STREET)
Entity Definitions:
Person Entity:
import javax.persistence.CascadeType;
import javax.persistence.Column;
import javax.persistence.Embedded;
import javax.persistence.Entity;
import javax.persistence.FetchType;
import javax.persistence.Id;
import javax.persistence.JoinColumn;
import javax.persistence.OneToOne;
import javax.persistence.Table;
@Entity
@Table(name="PERSON", schema="mysqlschema")
public class Person {
@Id
@Column(name="PERSON_ID")
private String personId;
@Column(name="PERSON_NAME")
private String personName;
@Embedded
PersonalData personalData;
@OneToOne(cascade=CascadeType.ALL, fetch=FetchType.LAZY)
@JoinColumn(name="ADDRESS_ID")
private Address address;
//Constructors and getters/ setters omitted
}
PersonalData Embedded object:
import javax.persistence.Column;
import javax.persistence.Embeddable;
@Embeddable
public class PersonalData
{
@Column(name = "p_website")
private String website;
@Column(name = "p_email")
private String email;
@Column(name = "p_yahoo_id")
private String yahooId;
//Constructors and getters/ setters omitted
}
Address Entity:
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;
@Entity
@Table(name="ADDRESS", schema="KunderaKeyspace@CassandraPU")
public class Address
{
@Id
@Column(name = "ADDRESS_ID")
private String addressId;
@Column(name = "STREET")
private String street;
//Constructors and getters/ setters omitted
}
DB Operation using Kundera:
import java.util.List;
import javax.persistence.Persistence;
import javax.persistence.EntityManagerFactory;
import javax.persistence.EntityManager;
import javax.persistence.Query;
//Persist Person entity
Person person = new Person();
person.setPersonId("1");
person.setPersonName("John Smith");
person.setPersonalData(new PersonalData("www.johnsmith.com", "john.smith@gmail.com", "jsmith"));
Address address = new Address();
address.setAddressId("111");
address.setStreet("123, New street");
person.setAddress(address);
EntityManagerFactory emf = Persistence.createEntityManagerFactory("mysqlPU,CassandraPU");
EntityManager em = emf.createEntityManager();
em.persist(person);
//Find Person Entity
Person p = em.find(Person.class, "1");
//Run JPA Query
Query q = em.createQuery("select p from Person p");
List<?> persons = q.getResultList();
em.close();
emf.close();
OneToMany Relationship
Database Table Structure:
PERSON (PERSON_ID, PERSON_NAME, p_website, p_email, p_yahoo_id)
ADDRESS(ADDRESS_ID, STREET, PERSON_ID)
Entity Definitions:
Person Entity:
//Imports here
@Entity
@Table(name = "PERSON", schema = "mysqlschema")
public class Person {
@Id
@Column(name = "PERSON_ID")
private String personId;
@Column(name = "PERSON_NAME")
private String personName;
@Embedded
PersonalData personalData;
@OneToMany(cascade = CascadeType.ALL, fetch = FetchType.LAZY)
@JoinColumn(name="PERSON_ID")
private Set<Address> addresses;
//Constructors, getters and setters omitted
}
PersonalData Embedded object:
Same as above.
Address Entity:
//Imports here
@Entity
@Table(name="ADDRESS", schema="KunderaKeyspace@CassandraPU")
public class Address
{
@Id
@Column(name = "ADDRESS_ID")
private String addressId;
@Column(name = "STREET")
private String street;
}
DB Operation using Kundera:
Person person = new Person();
person.setPersonId("1");
person.setPersonName("John Smith");
person.setPersonalData(new PersonalData("www.johnsmith.com", "john.smith@gmail.com", "jsmith"));
Set<Address> addresses = new HashSet<Address>();
Address address1 = new Address();
address1.setAddressId("111");
address1.setStreet("123, Old street");
Address address2 = new Address();
address2.setAddressId("222");
address2.setStreet("456, New street");
addresses.add(address1);
addresses.add(address2);
person.setAddresses(addresses);
EntityManagerFactory emf = Persistence.createEntityManagerFactory("mysqlPU,CassandraPU");
EntityManager em = emf.createEntityManager();
//Save Person entity
em.persist(person);
//Find Person Entity
Person p = em.find(Person.class, "1");
//Run JPA Query
Query q = em.createQuery("select p from Person p");
List<?> persons = q.getResultList();
em.close();
emf.close();
ManyToOne Relationship
Database Table Structure:
PERSON (PERSON_ID, PERSON_NAME, p_website, p_email, p_yahoo_id, ADDRESS_ID)
ADDRESS(ADDRESS_ID, STREET)
Entity Definitions:
Person Entity:
//Imports here
@Entity
@Table(name = "PERSON", schema = "mysqlschema")
public class Person {
@Id
@Column(name = "PERSON_ID")
private String personId;
@Column(name = "PERSON_NAME")
private String personName;
@Embedded
private PersonalData personalData;
@ManyToOne(cascade = CascadeType.ALL, fetch = FetchType.LAZY)
@JoinColumn(name="ADDRESS_ID")
private Address address;
//Constructor, getters, setters here
}
PersonalData Embedded object:
Same as above.
Address Entity:
//Imports here
@Entity
@Table(name="ADDRESS", schema="KunderaKeyspace@CassandraPU")
public class Address {
@Id
@Column(name = "ADDRESS_ID")
private String addressId;
@Column(name = "STREET")
private String street;
//Constructors, getters, setters here
}
DB Operation using Kundera:
Person person1 = new Person();
person1.setPersonId("1");
person1.setPersonName("John Smith");
person1.setPersonalData(new PersonalData("www.johnsmith.com", "john.smith@gmail.com", "jsmith"));
Person person2 = new Person();
person2.setPersonId("2");
person2.setPersonName("Patrick Wilson");
person2.setPersonalData(new PersonalData("www.patrickwilson.com", "patrick.wilson@gmail.com", "pwilson"));
Address address = new Address();
address.setAddressId("111");
address.setStreet("123, Old street");
person1.setAddress(address);
person2.setAddress(address);
Set<Person> persons = new HashSet<Person>();
persons.add(person1);
persons.add(person2);
EntityManagerFactory emf = Persistence.createEntityManagerFactory("mysqlPU,CassandraPU");
EntityManager em = emf.createEntityManager();
//Save Person entities
for(Person person : persons) {
em.persist(person);
}
//Find Person Entities
Person p1 = em.find(Person.class, "1");
Address add1 = p1.getAddress();
Person p2 = em.find(Person.class, "2");
Address add2 = p2.getAddress();
//Run JPA Query
Query q = em.createQuery("select p from Person p");
List<?> ps = q.getResultList();
em.close();
emf.close();
ManyToMany Relationship
Database Table Structure:
PERSON (PERSON_ID, PERSON_NAME, p_website, p_email, p_yahoo_id)
ADDRESS(ADDRESS_ID, STREET)
PERSON_ADDRESS(PERSON_ID, ADDRESS_ID)
Entity Definitions:
Person Entity:
//Imports here
@Entity
@Table(name = "PERSON", schema = "mysqlschema")
public class Person {
@Id
@Column(name = "PERSON_ID")
private String personId;
@Column(name = "PERSON_NAME")
private String personName;
@Embedded
private PersonalData personalData;
@ManyToMany
@JoinTable(name = "PERSON_ADDRESS",
joinColumns = {
@JoinColumn(name="PERSON_ID")
},
inverseJoinColumns = {
@JoinColumn(name="ADDRESS_ID")
}
)
private Set<Address> addresses;
//Constructor, getters, setters here
}
PersonalData Embedded object:
Same as above.
Address Entity:
//Imports here
@Entity
@Table(name="ADDRESS", schema="KunderaKeyspace@CassandraPU")
public class Address {
@Id
@Column(name = "ADDRESS_ID")
private String addressId;
@Column(name = "STREET")
private String street;
//Constructors, getters, setters here
}
DB Operation using Kundera:
Person person1 = new Person();
person1.setPersonId("1");
person1.setPersonName("John Smith");
person1.setPersonalData(new PersonalData("www.johnsmith.com", "john.smith@gmail.com", "jsmith"));
Person person2 = new Person();
person2.setPersonId("2");
person2.setPersonName("Patrick Wilson");
person2.setPersonalData(new PersonalData("www.patrickwilson.com", "patrick.wilson@gmail.com", "pwilson"));
Address address1 = new Address();
address1.setAddressId("111");
address1.setStreet("123, Old street");
Address address2 = new Address();
address2.setAddressId("222");
address2.setStreet("456, New street");
Address address3 = new Address();
address3.setAddressId("333");
address3.setStreet("789, Forbidden street");
person1.addAddress(address1);
person1.addAddress(address2);
person2.addAddress(address2);
person2.addAddress(address3);
Set<Person> persons = new HashSet<Person>();
persons.add(person1);
persons.add(person2);
EntityManagerFactory emf = Persistence.createEntityManagerFactory("mysqlPU,CassandraPU");
EntityManager em = emf.createEntityManager();
//Save Person entities
for(Person person : persons) {
em.persist(person);
}
//Find Person Entities
Person p1 = em.find(Person.class, "1");
Set<Address> adds1 = p1.getAddresses();
Address address11 = adds1.iterator().next(); //a Set has no get(int)
Person p2 = em.find(Person.class, "2");
Set<Address> adds2 = p2.getAddresses();
Address address21 = adds2.iterator().next();
//Address holds no reference back to Person in a unidirectional association
//Run JPA Query
Query q = em.createQuery("select p from Person p");
List<?> ps = q.getResultList();
em.close();
emf.close();
Bidirectional Association
OneToOne Relationship
Database Table Structure:
PERSON (PERSON_ID, PERSON_NAME, p_website, p_email, p_yahoo_id, ADDRESS_ID)
ADDRESS(ADDRESS_ID, STREET)
Entity Definitions:
Person Entity:
@Entity
@Table(name = "PERSON", schema = "mysqlschema")
public class Person {
@Id
@Column(name = "PERSON_ID")
private String personId;
@Column(name = "PERSON_NAME")
private String personName;
@Embedded
private PersonalData personalData;
@OneToOne(cascade = CascadeType.ALL, fetch = FetchType.LAZY)
@JoinColumn(name="ADDRESS_ID")
private Address address;
//Constructors, getters, setters here
}
PersonalData Embedded object:
Same as above.
Address Entity:
@Entity
@Table(name="ADDRESS", schema="KunderaKeyspace@CassandraPU")
public class Address
{
@Id
@Column(name = "ADDRESS_ID")
private String addressId;
@Column(name = "STREET")
private String street;
@OneToOne(mappedBy="address")
private Person person;
//Constructors, getters, setters here
}
DB Operation using Kundera:
Person person = new Person();
person.setPersonId("1");
person.setPersonName("John Smith");
person.setPersonalData(new PersonalData("www.johnsmith.com", "john.smith@gmail.com", "jsmith"));
Address address = new Address();
address.setAddressId("111");
address.setStreet("123, old street");
person.setAddress(address);
EntityManagerFactory emf = Persistence.createEntityManagerFactory("mysqlPU,CassandraPU");
EntityManager em = emf.createEntityManager();
//Save Person entity
em.persist(person);
//Find Person Entity
Person p = em.find(Person.class, "1");
Address foundAddress = p.getAddress(); //"address" is already declared above
//Run JPA Query
Query q = em.createQuery("select p from Person p");
List<?> persons = q.getResultList();
em.close();
emf.close();
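One thing worth keeping in mind with bidirectional associations: in plain JPA, the application is responsible for keeping both in-memory sides of the relationship consistent. A common way to do that is to let the setter on one side update the other side as well. Below is a minimal, annotation-free sketch of that idiom. The nested `Person` and `Address` classes are simplified stand-ins for the entities above, and whether Kundera needs or expects this is an assumption on my part rather than something covered in this post.

```java
// Annotation-free stand-ins for the Person and Address entities above.
final class BidirectionalLinkSketch {

    static final class Person {
        private Address address;

        Address getAddress() {
            return address;
        }

        // Sets this side of the link and keeps the inverse side consistent.
        void setAddress(Address newAddress) {
            if (address != null) {
                address.person = null; // detach the previous address
            }
            if (newAddress != null && newAddress.person != null) {
                newAddress.person.address = null; // detach the previous owner
            }
            address = newAddress;
            if (newAddress != null) {
                newAddress.person = this;
            }
        }
    }

    static final class Address {
        private Person person;

        Person getPerson() {
            return person;
        }
    }

    public static void main(String[] args) {
        Person person = new Person();
        Address address = new Address();
        person.setAddress(address);
        System.out.println(address.getPerson() == person); // prints "true"
    }
}
```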
OneToMany Relationship
Database Table Structure:
PERSON (PERSON_ID, PERSON_NAME, p_website, p_email, p_yahoo_id)
ADDRESS(ADDRESS_ID, STREET, PERSON_ID)
Entity Definitions:
Person Entity:
//Imports here
@Entity
@Table(name = "PERSON", schema = "mysqlschema")
public class Person {
@Id
@Column(name = "PERSON_ID")
private String personId;
@Column(name = "PERSON_NAME")
private String personName;
@Embedded
private PersonalData personalData;
@OneToMany(cascade = CascadeType.ALL, fetch = FetchType.LAZY, mappedBy="person")
private Set<Address> addresses;
//Constructors, Getters, setters here
}
PersonalData Embedded object:
Same as above.
Address Entity:
//Imports here
@Entity
@Table(name="ADDRESS", schema="KunderaKeyspace@CassandraPU")
public class Address
{
@Id
@Column(name = "ADDRESS_ID")
private String addressId;
@Column(name = "STREET")
private String street;
@ManyToOne(fetch = FetchType.LAZY)
@JoinColumn(name="PERSON_ID")
private Person person;
//Constructors, getters, setters here
}
DB Operation using Kundera:
Person person = new Person();
person.setPersonId("1");
person.setPersonName("John Smith");
person.setPersonalData(new PersonalData("www.johnsmith.com", "john.smith@gmail.com", "jsmith"));
Set<Address> addresses = new HashSet<Address>();
Address address1 = new Address();
address1.setAddressId("111");
address1.setStreet("123, Old Street");
Address address2 = new Address();
address2.setAddressId("222");
address2.setStreet("456, New Street");
addresses.add(address1);
addresses.add(address2);
person.setAddresses(addresses);
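EntityManagerFactory emf = Persistence.createEntityManagerFactory("mysqlPU,CassandraPU");
EntityManager em = emf.createEntityManager();
//Save Person entity
em.persist(person);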
//Find Person Entity
Person p = em.find(Person.class, "1");
Set<Address> adds = p.getAddresses();
//Run JPA Query
Query q = em.createQuery("select p from Person p");
List<?> persons = q.getResultList();
em.close();
emf.close();
ManyToOne Relationship
Database Table Structure:
PERSON (PERSON_ID, PERSON_NAME, p_website, p_email, p_yahoo_id, ADDRESS_ID)
ADDRESS(ADDRESS_ID, STREET)
Entity Definitions:
Person Entity:
//Imports here
@Entity
@Table(name = "PERSON", schema = "mysqlschema")
public class Person {
@Id
@Column(name = "PERSON_ID")
private String personId;
@Column(name = "PERSON_NAME")
private String personName;
@Embedded
private PersonalData personalData;
@ManyToOne(cascade = CascadeType.ALL, fetch = FetchType.LAZY)
@JoinColumn(name="ADDRESS_ID")
private Address address;
//Constructors, getters, setters here
}
PersonalData Embedded object:
Same as above.
Address Entity:
//Imports here
@Entity
@Table(name="ADDRESS", schema="KunderaKeyspace@CassandraPU")
public class Address
{
@Id
@Column(name = "ADDRESS_ID")
private String addressId;
@Column(name = "STREET")
private String street;
@OneToMany(mappedBy="address", fetch = FetchType.LAZY)
private Set<Person> people;
//Constructors, getters, setters here
}
DB Operation using Kundera:
Person person1 = new Person();
person1.setPersonId("1");
person1.setPersonName("John Smith");
person1.setPersonalData(new PersonalData("www.johnsmith.com", "john.smith@gmail.com", "jsmith"));
Person person2 = new Person();
person2.setPersonId("2");
person2.setPersonName("Patrick Wilson");
person2.setPersonalData(new PersonalData("www.patrickwilson.com", "patrick.wilson@gmail.com", "pwilson"));
Address address = new Address();
address.setAddressId("111");
address.setStreet("123, Old street");
person1.setAddress(address);
person2.setAddress(address);
Set<Person> persons = new HashSet<Person>();
persons.add(person1);
persons.add(person2);
//Save Person entities
for(Person person : persons) {
em.persist(person);
}
//Find Person Entities
Person p1 = em.find(Person.class, "1");
Address add1 = p1.getAddress();
Set<Person> people1 = add1.getPeople();
Person p2 = em.find(Person.class, "2");
Address add2 = p2.getAddress();
Set<Person> people2 = add2.getPeople();
//Run JPA Query
Query q = em.createQuery("select p from Person p");
List<?> ps = q.getResultList();
em.close();
emf.close();
}
ManyToMany Relationship
Database Table Structure:
PERSON (PERSON_ID, PERSON_NAME, p_website, p_email, p_yahoo_id)
ADDRESS (ADDRESS_ID, STREET)
PERSON_ADDRESS (PERSON_ID, ADDRESS_ID)
Entity Definitions:
Person Entity:
//Imports here
@Entity
@Table(name = "PERSON", schema = "mysqlschema")
public class Person {
@Id
@Column(name = "PERSON_ID")
private String personId;
@Column(name = "PERSON_NAME")
private String personName;
@Embedded
private PersonalData personalData;
@ManyToMany
@JoinTable(name = "PERSON_ADDRESS",
joinColumns = {
@JoinColumn(name="PERSON_ID")
},
inverseJoinColumns = {
@JoinColumn(name="ADDRESS_ID")
}
)
private Set<Address> addresses;
//Constructors, getters, setters here
}
PersonalData Embedded object:
Same as above.
Address Entity:
//Imports here
@Entity
@Table(name="ADDRESS", schema="KunderaKeyspace@CassandraPU")
public class Address
{
@Id
@Column(name = "ADDRESS_ID")
private String addressId;
@Column(name = "STREET")
private String street;
@ManyToMany(mappedBy = "addresses", fetch = FetchType.LAZY)
private Set<Person> people;
//Constructors, getters, setters here
}
DB Operation using Kundera:
Person person1 = new Person();
person1.setPersonId("1");
person1.setPersonName("John Smith");
person1.setPersonalData(new PersonalData("www.johnsmith.com", "john.smith@gmail.com", "jsmith"));
Person person2 = new Person();
person2.setPersonId("2");
person2.setPersonName("Patrick Wilson");
person2.setPersonalData(new PersonalData("www.patrickwilson.com", "patrick.wilson@gmail.com", "pwilson"));
Address address = new Address();
address.setAddressId("111");
address.setStreet("123, Old street");
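//Both persons share the same address; Person maps a Set<Address> here, not a single Address
Set<Address> addresses1 = new HashSet<Address>();
addresses1.add(address);
person1.setAddresses(addresses1);
Set<Address> addresses2 = new HashSet<Address>();
addresses2.add(address);
person2.setAddresses(addresses2);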
Set<Person> persons = new HashSet<Person>();
persons.add(person1);
persons.add(person2);
//Save Person entities
for(Person person : persons) {
em.persist(person);
}
//Find Person Entities
Person p1 = em.find(Person.class, "1");
Set<Address> adds1 = p1.getAddresses();
Set<Person> people1 = adds1.iterator().next().getPeople();
Person p2 = em.find(Person.class, "2");
Set<Address> adds2 = p2.getAddresses();
Set<Person> people2 = adds2.iterator().next().getPeople();
//Run JPA Query
Query q = em.createQuery("select p from Person p");
List<?> ps = q.getResultList();
em.close();
emf.close();
}
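In a bidirectional mapping like the one above, application code usually has to keep both sides of the association in step before calling persist. Below is a minimal sketch of one way to do that in plain Java. The JPA annotations are deliberately left out so that it compiles without a persistence provider, and the helper methods addAddress, removeAddress, linkPerson and unlinkPerson are hypothetical names introduced for illustration; they are not part of Kundera or JPA.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Plain-Java stand-ins for the Person and Address entities above.
// Annotations are omitted on purpose; this only illustrates the bookkeeping.
class Address {
    private final String addressId;
    private final Set<Person> people = new HashSet<>();

    Address(String addressId) { this.addressId = addressId; }

    String getAddressId() { return addressId; }

    // Read-only view, so callers must go through Person.addAddress/removeAddress.
    Set<Person> getPeople() { return Collections.unmodifiableSet(people); }

    // Hypothetical package-private hooks used only by Person.
    void linkPerson(Person person) { people.add(person); }
    void unlinkPerson(Person person) { people.remove(person); }
}

class Person {
    private final String personId;
    private final Set<Address> addresses = new HashSet<>();

    Person(String personId) { this.personId = personId; }

    String getPersonId() { return personId; }

    Set<Address> getAddresses() { return Collections.unmodifiableSet(addresses); }

    // Hypothetical helper: updates the owning side (Person.addresses)
    // and the inverse side (Address.people) in one call.
    void addAddress(Address address) {
        if (addresses.add(address)) {
            address.linkPerson(this);
        }
    }

    // Hypothetical helper: removes the link from both sides.
    void removeAddress(Address address) {
        if (addresses.remove(address)) {
            address.unlinkPerson(this);
        }
    }
}
```

With helpers like these, calling person1.addAddress(address) and person2.addAddress(address) leaves address.getPeople() containing both persons, so the in-memory object graph matches what a later em.find is expected to return.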
Conclusion
Applications at times need to persist data in multiple databases (occasionally a combination of RDBMS and NoSQL).
With low-level database drivers and APIs, persisting, retrieving and querying related data across multiple stores takes considerable effort. Kundera addresses this problem by providing a simple, easy-to-use and clean interface. It hides the complexity and maintains those relationships behind the scenes. This is consistent with the basic philosophy behind Kundera: “Working with NoSQL should be as easy and fun as RDBMS”.



