VISUALIZING CLASS DIAGRAM USING OrientDB DATA-STORE

Abstract: Relational databases have provided data storage for several decades. The term NoSQL broadly covers non-relational databases that offer a scalable, schema-less model. NoSQL databases are used by major organizations operating in the era of Web 2.0. The main categories of NoSQL databases are key-value, document, column-oriented and graph databases, which let programmers work with data in a form closer to the one used in their applications. In this paper, a class diagram is loaded into OrientDB through its Java API so that the class diagram can be visualized as an OrientDB graph. OrientDB supports both the graph and the document model, and also supports inheritance and polymorphism.

I. INTRODUCTION
NoSQL refers to next-generation [6] databases that typically have the following properties: non-relational, non-ACID [12], distributed, open-source, horizontally scalable and schema-less. The original intention was to support modern web-scale databases. The rise of NoSQL [15] databases is challenging the dominance of relational databases, which have dominated the software industry for decades. There is an impedance mismatch between relational data structures and in-memory data structures. A NoSQL database allows developers to work without having to convert in-memory structures to relational structures.
Key-value stores are the simplest NoSQL data stores to use from an API perspective. A client can get the value for a key, put a value for a key, or delete a key from the data store. The value is stored as an opaque blob; the store does not know what is inside it. Key-value stores always access data by primary key and typically use a hash table in which each key points to a particular item of data. Key-based lookups keep query execution time low, and because a value can be anything (an object, a hash and so on), the model is flexible and schema-less, which suits today’s unstructured data, performs well and scales easily. Examples of key-value databases are Riak (Basho) [8], Redis (VMware) [8], Amazon DynamoDB [5] and Couchbase [15]. This model is not ideal when only part of a value must be updated or when the database must be queried by anything other than the key.
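The get, put and delete operations described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `KeyValueStore` and the key `user:42` are hypothetical, a Python dict stands in for the hash table, and `pickle` stands in for whatever opaque encoding a real store would use. It is not the API of Riak, Redis, DynamoDB or Couchbase.

```python
import pickle


class KeyValueStore:
    """Toy in-memory key-value store; every value is kept as opaque bytes."""

    def __init__(self):
        self._table = {}  # a Python dict is a hash table

    def put(self, key, value):
        # The store serializes the value and never looks inside it.
        self._table[key] = pickle.dumps(value)

    def get(self, key):
        blob = self._table.get(key)
        return None if blob is None else pickle.loads(blob)

    def delete(self, key):
        self._table.pop(key, None)


store = KeyValueStore()
store.put("user:42", {"name": "Asha", "city": "Patiala"})

# Changing one field requires reading, modifying and rewriting the whole value.
user = store.get("user:42")
user["city"] = "Chandigarh"
store.put("user:42", user)
```

The read-modify-write at the end shows why this model is a poor fit for partial updates: the store cannot change one field on its own because it only sees bytes.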
Documents are the central concept in document databases. The database stores and retrieves documents, which can be XML, JSON, BSON and so on. These documents are self-describing and form hierarchical tree data structures that can consist of maps, collections and scalar values. For example, one could search for all documents in which “City” is “Patiala”; the result set would contain every document, such as one describing a “3 Storey Office”, that is located in that city. Apache CouchDB [9] and MongoDB [15] are popular examples of document stores. CouchDB uses JSON to store data, JavaScript as its query language (through MapReduce) and HTTP for its API. MongoDB is designed to meet challenges such as horizontal scalability, high availability and the flexibility to handle semi-structured data. MongoDB has typical applications in content management systems, mobile applications, gaming and archiving. Document databases are schema-less, so fields can be added to a JSON document without first defining the change in a schema.
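The search by “City” mentioned above can be illustrated with plain Python dictionaries standing in for JSON documents. This is a minimal sketch, not the query API of CouchDB or MongoDB; the helper `find`, the field names and the sample documents are assumptions made for illustration.

```python
# Three hypothetical documents; note that they do not share the same fields.
documents = [
    {"_id": 1, "type": "3 Storey Office", "address": {"City": "Patiala"}},
    {"_id": 2, "type": "Warehouse", "address": {"City": "Ludhiana"}, "area": 900},
    {"_id": 3, "type": "Shop", "address": {"City": "Patiala"}, "floors": 1},
]


def find(docs, path, expected):
    """Return the documents whose value at the dotted path equals expected."""
    results = []
    for doc in docs:
        node = doc
        for part in path.split("."):
            # Walk down the document tree; stop if the path does not exist.
            node = node.get(part) if isinstance(node, dict) else None
            if node is None:
                break
        if node is not None and node == expected:
            results.append(doc)
    return results


in_patiala = find(documents, "address.City", "Patiala")
```

Because the documents are self-describing, the query works even though the second and third documents carry fields that the first one lacks.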
In column-oriented (wide-table) data stores, data is stored in cells grouped by column rather than by row. Columns are logically grouped into column families. A column family can hold a virtually unlimited number of columns, which can be created at runtime or in the schema definition. Reads and writes are performed on columns rather than rows. Most relational DBMSs, in comparison, store data in rows; the benefit of storing data in columns is fast search, access and data aggregation [7]. Relational databases store a single row as a contiguous disk entry, and different rows are stored in different regions of the disk, whereas columnar databases store all the cells of a column as a contiguous disk entry, which makes search and access faster. For example, querying the titles of a million articles is a tedious task in a relational database, because it must visit each row location to get the title. In a columnar database, by contrast, the titles of all the items can be obtained with a single disk access. Popular open-source column-oriented databases are Hypertable [8], HBase [15] and Cassandra [15]. Hypertable and HBase are derivatives of BigTable, whereas Cassandra takes its features from both BigTable and Dynamo.
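The difference between the two layouts can be sketched in memory as follows. This is only an analogy for the on-disk layout, with hypothetical article records; it does not model column families or any specific database.

```python
# Row layout: each record is kept together, as in a relational table.
rows = [
    {"id": 1, "title": "Intro to NoSQL", "words": 1200},
    {"id": 2, "title": "Graph traversal", "words": 800},
    {"id": 3, "title": "Column families", "words": 950},
]

# Column layout: all cells of one column are kept together.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# Reading the titles from the row layout touches every record ...
titles_from_rows = [row["title"] for row in rows]

# ... while the column layout already holds them as one contiguous list.
titles_from_columns = columns["title"]

# Aggregation over a single column is equally direct.
total_words = sum(columns["words"])
```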
A graph NoSQL database does not use the rigid tables-and-columns representation of SQL; instead it uses a flexible graph design that is well suited to addressing scalability concerns. It does not require a pre-defined schema, which makes it easier to adapt as the schema evolves. It stores entities and the relationships among them. Entities are stored as nodes, and relationships are stored as edges, which can have properties. Edges [9] are directional, and nodes are organized by relationships, which allows interesting patterns among the nodes to be found. The graphs can be correlated and interpreted in different ways. When a graph-like structure is stored in an RDBMS, it is usually for a single type of relationship; adding another relationship to the mix usually means many schema changes and much data movement, whereas in a graph database it is easily done. In a relational database the graph is modelled beforehand according to the traversal one wants, and if the traversal changes, the data must change too; in a graph database, traversing the relationships is very fast [13]. The relationship between nodes is not calculated at query time but is persisted as a relationship, and traversing persisted relationships is faster than calculating them for every query. For example, social networking websites, where the relationships among data are as important as the data itself, are good candidates for graph-based storage. More than 20 graph databases are available, some proprietary and others open-source; popular ones include Neo4j [8], Titan [15], OrientDB [8], AllegroGraph [15] and InfiniteGraph [15].
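A small sketch can make the idea of persisted, directional edges with properties concrete. The `Graph` class, the people and the edge labels below are hypothetical and written in plain Python; they do not reproduce the API of Neo4j, Titan or OrientDB.

```python
from collections import deque


class Graph:
    """Toy property graph: every vertex keeps its outgoing edges directly."""

    def __init__(self):
        self.out_edges = {}  # vertex -> list of (label, target, properties)

    def add_vertex(self, name):
        self.out_edges.setdefault(name, [])

    def add_edge(self, source, label, target, **properties):
        self.add_vertex(source)
        self.add_vertex(target)
        self.out_edges[source].append((label, target, properties))

    def traverse(self, start, label, max_depth):
        """Breadth-first walk along edges with the given label."""
        seen = {start}
        queue = deque([(start, 0)])
        reached = []
        while queue:
            vertex, depth = queue.popleft()
            if depth == max_depth:
                continue
            # The edges are stored with the vertex, so no join is computed.
            for edge_label, target, _props in self.out_edges[vertex]:
                if edge_label == label and target not in seen:
                    seen.add(target)
                    reached.append(target)
                    queue.append((target, depth + 1))
        return reached


g = Graph()
g.add_edge("Alice", "follows", "Bob", since=2012)
g.add_edge("Bob", "follows", "Carol", since=2014)
g.add_edge("Carol", "follows", "Dave", since=2015)
# A second relationship type is added without any schema change.
g.add_edge("Alice", "works_with", "Dave")

within_two_hops = g.traverse("Alice", "follows", max_depth=2)
```

Each step of the traversal follows references that were stored when the edge was created, which is the sense in which relationships are persisted rather than calculated at query time.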
OrientDB (Document + Graph): OrientDB is a second-generation distributed graph database [10] and the first multi-model open-source NoSQL DBMS that brings together the power of graphs and the flexibility of documents in one scalable, high-performance [13] operational database. First-generation graph databases lack features that big data demands: multi-master replication [6], sharding [6] and more flexibility for modern, complex use cases. OrientDB is reported to store up to 220,000 records per second [10] on common hardware. Even when it is used as a document database, relationships are handled as in graph databases, with direct connections between records. Parts of, or entire, trees and graphs of records can be traversed within a few milliseconds. OrientDB supports schema-less, schema-full and schema-mixed modes [10], has a strong security profiling system based on roles and users, and supports SQL among its query languages. Because it is a document database, any document can be stored on a vertex; because it is a graph database, new edges and properties can be introduced. It allows schemas to be introduced at runtime.
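The three schema modes can be illustrated with a small validation function. This is a sketch of the general idea only, assuming that schema-full means only declared properties are accepted, schema-less means nothing is declared, and schema-mixed means declared properties are checked while undeclared ones are allowed. The function `validate` and its parameters are hypothetical and are not part of the OrientDB API.

```python
def validate(record, schema, strict):
    """Check a record against a schema.

    schema maps a property name to the Python type it must have.
    An empty schema with strict=False models schema-less mode,
    a non-empty schema with strict=False models schema-mixed mode,
    and strict=True models schema-full mode.
    """
    # Declared properties must have the declared type when present.
    for name, expected_type in schema.items():
        if name in record and not isinstance(record[name], expected_type):
            return False
    # In strict mode, undeclared properties are rejected.
    if strict:
        return all(name in schema for name in record)
    return True


record = {"name": "Invoice", "extra": 1}

schema_less_ok = validate(record, {}, strict=False)
schema_mixed_ok = validate(record, {"name": str}, strict=False)
schema_full_ok = validate(record, {"name": str}, strict=True)
```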
Storing a class diagram in OrientDB helps to store the state of the system. For almost every system a domain model (or logical information model) is created, which describes what information the system must maintain. The state of the art for building such models is an object-oriented class diagram, typically in UML. This model represents classes with their properties and the associations among classes. With most databases there is some impedance mismatch when mapping this canonical model. In the relational model there is no support for inheritance or polymorphism, and relationships have to be mapped to keys. Graph databases generally also lack support for polymorphism and inheritance, and complex properties require new vertices. A document database likewise does not support polymorphism and provides very limited support for relationships. In OrientDB the mapping removes this impedance mismatch: an object becomes a vertex, complex properties are handled by documents, and relationships, inheritance and polymorphism are supported explicitly.
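The mapping from a class diagram to a graph can be sketched as follows. The sketch uses plain Python rather than the OrientDB Java API, and the class names (`Person`, `Employee`, `Manager`, `Department`), the `extends` label and the `worksIn` association are assumptions chosen for illustration; they are not taken from the case study.

```python
class ClassDiagramGraph:
    """Toy graph holding a class diagram: classes are vertices, relations are edges."""

    def __init__(self):
        self.vertices = {}  # class name -> document with its attributes
        self.edges = []     # (source, label, target)

    def add_class(self, name, attributes, extends=None):
        self.vertices[name] = {"attributes": attributes}
        if extends is not None:
            self.edges.append((name, "extends", extends))

    def add_association(self, source, target, role):
        self.edges.append((source, role, target))

    def subclasses(self, name):
        """Return all direct and indirect subclasses of a class."""
        result = []
        for source, label, target in self.edges:
            if label == "extends" and target == name:
                result.append(source)
                result.extend(self.subclasses(source))
        return result

    def all_attributes(self, name):
        """Return inherited attributes followed by the class's own attributes."""
        inherited = []
        for source, label, target in self.edges:
            if source == name and label == "extends":
                inherited = self.all_attributes(target)
        return inherited + self.vertices[name]["attributes"]


diagram = ClassDiagramGraph()
diagram.add_class("Person", ["name"])
diagram.add_class("Employee", ["salary"], extends="Person")
diagram.add_class("Manager", ["level"], extends="Employee")
diagram.add_class("Department", ["title"])
diagram.add_association("Employee", "Department", "worksIn")

person_family = diagram.subclasses("Person")
manager_attributes = diagram.all_attributes("Manager")
```

Following the `extends` edges answers both questions a class diagram usually raises: which classes can stand in for `Person` (polymorphism) and which attributes a `Manager` inherits.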
II. MOTIVATION
To create a graph that contains a class diagram using the OrientDB Java API, so that it becomes easy to analyse how the class diagram is represented in a graph database. This makes it easier to understand how the classes are related to each other and how polymorphism is used. An OrientDB graph is easily extensible, for example by adding information about methods or by extending the meta-data. In this manner, information about annotated methods and released revisions can also be managed.


Projects often need some degree of run-time configuration, in which users configure the rules governing the stored data structures, and this increases complexity. If a relational database sits behind such an application, retrieving the data requires complex joins. A schema-free document database can reduce this complexity. OrientDB allows schemas to be introduced at runtime and provides record-level security; any class can be altered to extend any other, including the vertex and edge classes of the graph.
To measure performance, based on the response time of data retrieval, using the OrientDB Java API.

III. NoSQL DATA-STORES
NoSQL (read as "not only SQL" or "non-relational" [3]) refers to databases that provide a mechanism for storing and retrieving data modelled differently from the tabular relations used in relational databases. Such databases have existed since the late 1960s, but did not acquire the NoSQL name until they became popular in the early twenty-first century, driven by the needs of Web 2.0 companies such as Facebook, Google and Amazon.com [15]. NoSQL databases are widely used in big data and real-time web [2] applications, and some also support SQL-like query languages.
DOCUMENT-ORIENTED DATA MODEL OR DOCUMENT STORE
A document-oriented database is designed for storing, retrieving and managing document-oriented information, also called semi-structured data [3]. The popularity of the term document-oriented database has grown [1] along with the term NoSQL itself. XML databases [4] are a subclass of document-oriented databases. Document-oriented databases are themselves a subclass of the key-value store [4], another NoSQL database concept. For example, a document can be encoded in JSON.
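As an illustration of JSON encoding, the sketch below builds a hypothetical document and serializes it with Python's standard `json` module. The field names and values are assumptions made for this example and are not taken from the paper.

```python
import json

# A hypothetical document with a scalar, a nested map and a collection.
document = {
    "FirstName": "Asha",
    "Address": {"Street": "12 Mall Road", "City": "Patiala"},
    "Hobbies": ["reading", "chess"],
}

# Encode the document as JSON text, the form in which a store would keep it.
encoded = json.dumps(document, indent=2)

# Decoding restores the same hierarchical structure.
decoded = json.loads(encoded)
```

Unlike the opaque value of a plain key-value store, the encoded text is self-describing, so a document database can index and query fields such as `Address.City`.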



CONCLUSION
NoSQL is a complementary technology for handling issues of scalability, complexity and performance. Non-relational databases offer several advantages over traditional relational databases [11], such as easier scaling across commodity servers or cloud instances and freedom from a rigid schema when inserting data, which makes it easy to capture different types of data without many schema-level changes.
In this paper, we discussed NoSQL databases and OrientDB, which combines the features of document and graph databases. Class diagrams are very popular among application developers, but their use together with non-relational databases has yet to be explored. To the best of our knowledge, no publication has explained how to represent a class diagram in OrientDB and query it to retrieve and update the class diagram without affecting the other classes. Owing to the limit on length, only two classes of NoSQL databases, document-oriented and graph-based, have been covered in this paper. A case study has been presented to illustrate the representation of a class diagram. With the help of five queries, the relationships between classes have been depicted as a graph in OrientDB and the performance of the queries has been recorded.