4.2 Mapping entities with identity
Before we walk through an entity class example and its mapping, you need to understand the difference between Java object identity and object equality; only then can we discuss terms like database identity and the way JPA manages identity. After that, we’ll dig in deeper and select a primary key, configure key generators, and finally go through identifier generator strategies.
4.2.1 Understanding Java identity and equality
Java developers understand the difference between Java object identity and equality. Object identity (==) is a notion defined by the Java virtual machine. Two references are identical if they point to the same memory location.
On the other hand, object equality is a notion defined by a class’s equals() method, sometimes also referred to as equivalence. Equivalence means two different (non-identical) instances have the same value—the same state. Two different instances of String are equal if they represent the same sequence of characters, even though each has its own location in the memory space of the virtual machine. (If you’re a Java guru, we acknowledge that String is a special case. Assume we used a different class to make the same point.)
Persistence complicates this picture. With object/relational persistence, a persistent instance is an in-memory representation of a particular row (or rows) of a database table (or tables). Along with Java identity and equality, we define database identity. You now have three methods for distinguishing references:
■ Objects are identical if they occupy the same memory location in the JVM. This can be checked with the a == b operator. This concept is known as object identity.
■ Objects are equal if they have the same state, as defined by the a.equals(Object b) method. Classes that don’t explicitly override this method inherit the implementation defined by java.lang.Object, which compares object identity with ==. This concept is known as object equality.
■ Objects stored in a relational database are identical if they share the same table and primary key value. This concept, mapped into the Java space, is known as database identity.
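The three notions can be compared side by side in a small sketch. The classes Money and RowReference are hypothetical and exist only for this illustration; RowReference is a plain stand-in for a persistent instance, not something a JPA provider produces.

```java
// Hypothetical value class: overrides equals(), so equality is based on state.
final class Money {
    private final long cents;

    Money(long cents) {
        this.cents = cents;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Money)) return false;
        return cents == ((Money) other).cents;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(cents);
    }
}

// Hypothetical stand-in for a row: table name plus primary key value.
// It does not override equals(), so it inherits the == comparison of java.lang.Object.
final class RowReference {
    final String table;
    final Long primaryKey;

    RowReference(String table, Long primaryKey) {
        this.table = table;
        this.primaryKey = primaryKey;
    }

    // Database identity: same table and same (non-null) primary key value.
    boolean sameDatabaseIdentity(RowReference other) {
        return primaryKey != null
            && table.equals(other.table)
            && primaryKey.equals(other.primaryKey);
    }
}

final class IdentityDemo {
    public static void main(String[] args) {
        Money a = new Money(500);
        Money b = new Money(500);
        System.out.println(a == b);      // false: two distinct instances
        System.out.println(a.equals(b)); // true: same state

        RowReference x = new RowReference("ITEM", 123L);
        RowReference y = new RowReference("ITEM", 123L);
        System.out.println(x.equals(y));               // false: inherited identity comparison
        System.out.println(x.sameDatabaseIdentity(y)); // true: same table and key
    }
}
```

The last two lines show why database identity has to be treated as a separate concept: two in-memory instances can be neither identical nor equal and still represent the same row.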
We now need to look at how database identity relates to object identity and how to express database identity in the mapping metadata. As an example, you’ll map an entity of a domain model.
4.2.2 A first entity class and mapping
We weren’t completely honest in the previous chapter: the @Entity annotation isn’t enough to map a persistent class. You also need an @Id annotation, as shown in the following listing.
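import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;

@Entity
public class Item {

    @Id
    @GeneratedValue(generator = "ID_GENERATOR")
    protected Long id;

    public Long getId() { // Optional but useful
        return id;
    }
}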
This is the most basic entity class, marked as “persistence capable” with the @Entity annotation, and with an @Id mapping for the database identifier property. The class maps by default to a table named ITEM in the database schema.
Every entity class has to have an @Id property; it’s how JPA exposes database identity to the application. We don’t show the identifier property in our diagrams; we assume that each entity class has one. In our examples, we always name the identifier property id. This is a good practice for your own project; use the same identifier property name for all your domain model entity classes. If you specify nothing else, this property maps to a primary key column named ID of the ITEM table in your database schema.
Hibernate will use the field to access the identifier property value when loading and storing items, not getter or setter methods. Because @Id is on a field, Hibernate will now enable every field of the class as a persistent property by default. The rule in JPA is this: if @Id is on a field, the JPA provider will access fields of the class directly and consider all fields part of the persistent state by default. You’ll see how to override this later in this chapter—in our experience, field access is often the best choice, because it gives you more freedom for accessor method design.
Should you have a (public) getter method for the identifier property? Well, the application often uses database identifiers as a convenient handle to a particular instance, even outside the persistence layer. For example, it’s common for web applications to display the results of a search screen to the user as a list of summaries. When the user selects a particular element, the application may need to retrieve the selected item, and it’s common to use a lookup by identifier for this purpose—you’ve probably already used identifiers this way, even in applications that rely on JDBC.
Should you have a setter method? Primary key values never change, so you shouldn’t allow modification of the identifier property value. Hibernate won’t update a primary key column, and you shouldn’t expose a public identifier setter method on an entity.
The Java type of the identifier property, java.lang.Long in the previous example, depends on the primary key column type of the ITEM table and how key values are produced. This brings us to the @GeneratedValue annotation and primary keys in general.
4.2.3 Selecting a primary key
The database identifier of an entity is mapped to some table primary key, so let’s first get some background on primary keys without worrying about mappings. Take a step back and think about how you identify entities.
A candidate key is a column or set of columns that you could use to identify a particular row in a table. To become the primary key, a candidate key must satisfy the following requirements:
■ The value of any candidate key column is never null. You can’t identify something with data that is unknown, and there are no nulls in the relational model. Some SQL products allow defining (composite) primary keys with nullable columns, so you must be careful.
■ The value of the candidate key column(s) is a unique value for any row.
■ The value of the candidate key column(s) never changes; it’s immutable.
The relational model defines that a candidate key must be unique and irreducible (no subset of the key attributes has the uniqueness property). Beyond that, picking a candidate key as the primary key is a matter of taste. But Hibernate expects a candidate key to be immutable when used as the primary key. Hibernate doesn't support updating primary key values with an API; if you try to work around this requirement, you'll run into problems with Hibernate's caching and dirty-checking engine. If your database schema relies on updatable primary keys (and maybe uses ON UPDATE CASCADE foreign key constraints), you must change the schema before it will work with Hibernate.
If a table has only one identifying attribute, it becomes, by definition, the primary key. But several columns or combinations of columns may satisfy these properties for a particular table; you choose between candidate keys to decide the best primary key for the table. You should declare candidate keys not chosen as the primary key as unique keys in the database if their value is indeed unique (but maybe not immutable).
Many legacy SQL data models use natural primary keys. A natural key is a key with business meaning: an attribute or combination of attributes that is unique by virtue of its business semantics. Examples of natural keys are the US Social Security Number and Australian Tax File Number. Distinguishing natural keys is simple: if a candidate key attribute has meaning outside the database context, it’s a natural key, regardless of whether it’s automatically generated. Think about the application users: if they refer to a key attribute when talking about and working with the application, it’s a natural key: “Can you send me the pictures of item #123-abc?”
Experience has shown that natural primary keys usually cause problems in the end. A good primary key must be unique, immutable, and never null. Few entity attributes satisfy these requirements, and some that do can’t be efficiently indexed by SQL databases (although this is an implementation detail and shouldn’t be the deciding factor for or against a particular key). In addition, you should make certain that a candidate key definition never changes throughout the lifetime of the database. Changing the value (or even definition) of a primary key, and all foreign keys that refer to it, is a frustrating task. Expect your database schema to survive decades, even if your application won’t.
Furthermore, you can often only find natural candidate keys by combining several columns in a composite natural key. These composite keys, although certainly appropriate for some schema artifacts (like a link table in a many-to-many relationship), potentially make maintenance, ad hoc queries, and schema evolution much more difficult. We talk about composite keys later in the book, in section 9.2.1.
For these reasons, we strongly recommend that you add synthetic identifiers, also called surrogate keys. Surrogate keys have no business meaning—they have unique values generated by the database or application. Application users ideally don’t see or refer to these key values; they’re part of the system internals. Introducing a surrogate key column is also appropriate in the common situation when there are no candidate keys. In other words, (almost) every table in your schema should have a dedicated surrogate primary key column with only this purpose.
There are a number of well-known approaches to generating surrogate key values. The aforementioned @GeneratedValue annotation is how you configure this.
4.2.4 Configuring key generators
The @Id annotation is required to mark the identifier property of an entity class. Without the @GeneratedValue next to it, the JPA provider assumes that you’ll take care of creating and assigning an identifier value before you save an instance. We call this an application-assigned identifier. Assigning an entity identifier manually is necessary when you’re dealing with a legacy database and/or natural primary keys. We have more to say about this kind of mapping in a dedicated section, 9.2.1.
Usually you want the system to generate a primary key value when you save an entity instance, so you write the @GeneratedValue annotation next to @Id. JPA standardizes several value-generation strategies with the javax.persistence.GenerationType enum, which you select with @GeneratedValue(strategy = ...):
■ GenerationType.AUTO—Hibernate picks an appropriate strategy, asking the SQL dialect of your configured database what is best. This is equivalent to @GeneratedValue() without any settings.
■ GenerationType.SEQUENCE—Hibernate expects (and creates, if you use the tools) a sequence named HIBERNATE_SEQUENCE in your database. The sequence will be called separately before every INSERT, producing sequential numeric values.
■ GenerationType.IDENTITY—Hibernate expects (and creates in table DDL) a special auto-incremented primary key column that automatically generates a numeric value on INSERT, in the database.
■ GenerationType.TABLE—Hibernate will use an extra table in your database schema that holds the next numeric primary key value, one row for each entity class. This table will be read and updated accordingly, before INSERTs. The default table name is HIBERNATE_SEQUENCES with columns SEQUENCE_NAME and SEQUENCE_NEXT_HI_VALUE. (The internal implementation uses a more complex but efficient hi/lo generation algorithm; more on this later.)
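As a brief illustration of where the strategy is selected, the following mapping fragment picks one of the standard strategies explicitly. The entity name Bid is a hypothetical example, and the fragment needs the JPA API on the classpath to compile.

```java
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;

// Hypothetical entity, used only to show the strategy attribute.
@Entity
public class Bid {

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE) // or AUTO, IDENTITY, TABLE
    protected Long id;

    public Long getId() {
        return id;
    }
}
```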
Although AUTO seems convenient, you need more control, so you usually shouldn’t rely on it and explicitly configure a primary key generation strategy. In addition, most applications work with database sequences, but you may want to customize the name and other settings of the database sequence. Therefore, instead of picking one of the JPA strategies, we recommend a mapping of the identifier with @GeneratedValue(generator = "ID_GENERATOR"), as shown in the previous example.
This is a named identifier generator; you are now free to set up the ID_GENERATOR configuration independently from your entity classes.
JPA has two built-in annotations you can use to configure named generators: @javax.persistence.SequenceGenerator and @javax.persistence.TableGenerator. With these annotations, you can create a named generator with your own sequence and table names. As usual with JPA annotations, you can unfortunately only use them at the top of a (maybe otherwise empty) class, and not in a package-info.java file.
For this reason, and because the JPA annotations don’t give us access to the full Hibernate feature set, we prefer an alternative: the native @org.hibernate.annotations.GenericGenerator annotation. It supports all Hibernate identifier generator strategies and their configuration details. Unlike the rather limited JPA annotations, you can use the Hibernate annotation in a package-info.java file, typically in the same package as your domain model classes. The next listing shows a recommended configuration.
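The listing itself is not reproduced in this copy. As a rough sketch, such a package-info.java might look like the following. The generator name ID_GENERATOR and the enhanced-sequence strategy come from the text; the sequence name JPWH_SEQUENCE and the package name org.jpwh.model are illustrative assumptions, and the initial value of 1000 is chosen to match the test-data discussion that follows.

```java
// package-info.java (sketch; sequence and package names are assumptions)
@org.hibernate.annotations.GenericGenerator(
    name = "ID_GENERATOR",            // referenced by @GeneratedValue(generator = "ID_GENERATOR")
    strategy = "enhanced-sequence",
    parameters = {
        @org.hibernate.annotations.Parameter(
            name = "sequence_name",
            value = "JPWH_SEQUENCE"   // assumed name of the sequence (or sequence table)
        ),
        @org.hibernate.annotations.Parameter(
            name = "initial_value",
            value = "1000"            // leaves 1 to 999 free for imported test data
        )
    }
)
package org.jpwh.model;
```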
This Hibernate-specific generator configuration has the following advantages:
■ The enhanced-sequence strategy produces sequential numeric values. If your SQL dialect supports sequences, Hibernate will use an actual database sequence. If your DBMS doesn’t support native sequences, Hibernate will manage and use an extra “sequence table,” simulating the behavior of a sequence. This gives you real portability: the generator can always be called before performing an SQL INSERT, unlike, for example, auto-increment identity columns, which produce a value on INSERT that has to be returned to the application afterward.
■ You can configure the sequence_name. Hibernate will either use an existing sequence or create it when you generate the SQL schema automatically. If your DBMS doesn’t support sequences, this will be the special “sequence table” name.
■ You can start with an initial_value that gives you room for test data. For example, when your integration test runs, Hibernate will make any new data insertions from test code with identifier values greater than 1000. Any test data you want to import before the test can use numbers 1 to 999, and you can refer to the stable identifier values in your tests: “Load item with id 123 and run some tests on it.” This is applied when Hibernate generates the SQL schema and sequence; it’s a DDL option.
■ You can share the same database sequence among all your domain model classes. There is no harm in specifying @GeneratedValue(generator = "ID_GENERATOR") in all your entity classes. It doesn’t matter if primary key values aren’t contiguous for a particular entity, as long as they’re unique within one table. If you’re worried about contention, because the sequence has to be called prior to every INSERT, we discuss a variation of this generator configuration later, in section 20.1.
Finally, you use java.lang.Long as the type of the identifier property in the entity class, which maps perfectly to a numeric database sequence generator. You could also use a long primitive. The main difference is what someItem.getId() returns on a new item that hasn’t been stored in the database: either null or 0. If you want to test whether an item is new, a null check is probably easier to understand for someone else reading your code. You shouldn’t use another integral type such as int or short for identifiers. Although they will work for a while (perhaps even years), as your database size grows, you may be limited by their range. If you generated a new identifier each millisecond with no gaps, the positive range of an Integer would be exhausted in about 25 days, whereas a Long would last for about 300 million years.
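The range argument can be checked with a little arithmetic. The sketch below assumes that only positive signed values are used and that exactly one identifier is consumed per millisecond; the class and method names are made up for this illustration.

```java
final class IdentifierRange {
    private static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    // Whole days until maxValue identifiers are used up at one identifier per millisecond.
    static long daysUntilExhausted(long maxValue) {
        return maxValue / MILLIS_PER_DAY;
    }

    // Same figure expressed in whole years of 365 days.
    static long yearsUntilExhausted(long maxValue) {
        return daysUntilExhausted(maxValue) / 365;
    }

    public static void main(String[] args) {
        System.out.println("int:  about " + daysUntilExhausted(Integer.MAX_VALUE) + " days");
        System.out.println("long: about "
            + yearsUntilExhausted(Long.MAX_VALUE) / 1_000_000 + " million years");
    }
}
```

With these assumptions the int range runs out in a matter of weeks, while the long range is effectively inexhaustible.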
Although recommended for most applications, the enhanced-sequence strategy as shown in listing 4.2 is just one of the strategies built into Hibernate.
4.2.5 Identifier generator strategies
Following is a list of all available Hibernate identifier generator strategies, their options, and our usage recommendations. If you don’t want to read the whole list now, enable GenerationType.AUTO and check what Hibernate defaults to for your database dialect. It’s most likely sequence or identity—a good but maybe not the most efficient or portable choice. If you require consistent portable behavior, and identifier values to be available before INSERTs, use enhanced-sequence, as shown in the previous section. This is a portable, flexible, and modern strategy, also offering various optimizers for large datasets.
Generating identifiers before or after INSERT: what’s the difference?
An ORM service tries to optimize SQL inserts: for example, by batching several at the JDBC level. Hence, SQL execution occurs as late as possible during a unit of work, not when you call entityManager.persist(someItem). This merely queues the insertion for later execution and, if possible, assigns the identifier value. But if you now call someItem.getId(), you might get null back if the engine wasn’t able to generate an identifier before the INSERT. In general, we prefer pre-insert generation strategies that produce identifier values independently, before INSERT. A common choice is a shared and concurrently accessible database sequence. Auto-incremented columns, column default values, or trigger-generated keys are only available after the INSERT.
We also show the relationship between each standard JPA strategy and its native Hibernate equivalent. Hibernate has been growing organically, so there are now two sets of mappings between standard and native strategies; we call them Old and New in the list. You can switch this mapping with the hibernate.id.new_generator_mappings setting in your persistence.xml file. The default is true; hence the New mapping. Software doesn’t age quite as well as wine:
■ native—Automatically selects other strategies, such as sequence or identity, depending on the configured SQL dialect. You have to look at the Javadoc (or even the source) of the SQL dialect you configured in persistence.xml. Equivalent to JPA GenerationType.AUTO with the Old mapping.
■ sequence—Uses a native database sequence named HIBERNATE_SEQUENCE. The sequence is called before each INSERT of a new row. You can customize the sequence name and provide additional DDL settings; see the Javadoc for the class org.hibernate.id.SequenceGenerator.
■ sequence-identity—Generates key values by calling a database sequence on insertion: for example, insert into ITEM(ID) values (HIBERNATE_SEQUENCE.nextval). The key value is retrieved after INSERT, the same behavior as the identity strategy. Supports the same parameters and property types as the sequence strategy; see the Javadoc for the class org.hibernate.id.SequenceIdentityGenerator and its parent.
■ enhanced-sequence—Uses a native database sequence when supported; otherwise falls back to an extra database table with a single column and row, emulating a sequence. Defaults to name HIBERNATE_SEQUENCE. Always calls the database “sequence” before an INSERT, providing the same behavior independently of whether the DBMS supports real sequences. Supports an org.hibernate.id.enhanced.Optimizer to avoid hitting the database before each INSERT; defaults to no optimization and fetching a new value for each INSERT. You can find more examples in chapter 20. For all parameters, see the Javadoc for the class org.hibernate.id.enhanced.SequenceStyleGenerator. Equivalent to JPA GenerationType.SEQUENCE and GenerationType.AUTO with the New mapping enabled, most likely your best option of the built-in strategies.
■ seqhilo—Uses a native database sequence named HIBERNATE_SEQUENCE, optimizing calls before INSERT by combining hi/lo values. If the hi value retrieved from the sequence is 1, the next 9 insertions will be made with key values 11, 12, 13, ..., 19. Then the sequence is called again to obtain the next hi value (2 or higher), and the procedure repeats with 21, 22, 23, and so on. You can configure the maximum lo value (9 is the default) with the max_lo parameter. Unfortunately, due to a quirk in Hibernate’s code, you cannot configure this strategy in @GenericGenerator. The only way to use it is with JPA GenerationType.SEQUENCE and the Old mapping. You can configure it with the standard JPA @SequenceGenerator annotation on a (maybe otherwise empty) class. See the Javadoc for the class org.hibernate.id.SequenceHiLoGenerator and its parent for more information. Consider using enhanced-sequence instead, with an optimizer.
■ hilo—Uses an extra table named HIBERNATE_UNIQUE_KEY with the same algorithm as the seqhilo strategy. The table has a single column and row, holding the next value of the sequence. The default maximum lo value is 32767, so you most likely want to configure it with the max_lo parameter. See the Javadoc for the class org.hibernate.id.TableHiLoGenerator for more information. We don’t recommend this legacy strategy; use enhanced-sequence instead with an optimizer.
■ enhanced-table—Uses an extra table named HIBERNATE_SEQUENCES, with one row by default representing the sequence, storing the next value. This value is selected and updated when an identifier value has to be generated. You can configure this generator to use multiple rows instead: one for each generator; see the Javadoc for org.hibernate.id.enhanced.TableGenerator. Equivalent to JPA GenerationType.TABLE with the New mapping enabled. Replaces the outdated but similar org.hibernate.id.MultipleHiLoPerTableGenerator, which is the Old mapping for JPA GenerationType.TABLE.
■ identity—Supports IDENTITY and auto-increment columns in DB2, MySQL, MS SQL Server, and Sybase. The identifier value for the primary key column will be generated on INSERT of a row. Has no options. Unfortunately, due to a quirk in Hibernate’s code, you cannot configure this strategy in @GenericGenerator. The only way to use it is with JPA GenerationType.IDENTITY and the Old or New mapping, making it the default for GenerationType.IDENTITY.
■ increment—At Hibernate startup, reads the maximum (numeric) primary key column value of each entity’s table and increments the value by one each time a new row is inserted. Especially efficient if a non-clustered Hibernate application has exclusive access to the database; but don’t use it in any other scenario.
■ select—Hibernate won’t generate a key value or include the primary key column in an INSERT statement. Hibernate expects the DBMS to assign a (default in schema or by trigger) value to the column on insertion. Hibernate then retrieves the primary key column with a SELECT query after insertion. Required parameter is key, naming the database identifier property (such as id) for the SELECT. This strategy isn’t very efficient and should only be used with old JDBC drivers that can’t return generated keys directly.
■ uuid2—Produces a unique 128-bit UUID in the application layer. Useful when you need globally unique identifiers across databases (say, you merge data from several distinct production databases in batch runs every night into an archive). The UUID can be encoded either as a java.lang.String, a byte[16], or a java.util.UUID property in your entity class. Replaces the legacy uuid and uuid.hex strategies. You configure it with an org.hibernate.id.UUIDGenerationStrategy; see the Javadoc for the class org.hibernate.id.UUIDGenerator for more details.
■ guid—Uses a globally unique identifier produced by the database, with an SQL function available on Oracle, Ingres, MS SQL Server, and MySQL. Hibernate calls the database function before an INSERT. Maps to a java.lang.String identifier property.
If you need full control over identifier generation, configure the strategy of @GenericGenerator with the fully qualified name of a class that implements the org.hibernate.id.IdentifierGenerator interface.
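The hi/lo combination described for the seqhilo strategy in the list above can be sketched in plain Java. This is a simplified illustration of the arithmetic only, not Hibernate’s implementation: the class HiLoSketch is hypothetical, and an AtomicLong stands in for the database sequence.

```java
import java.util.concurrent.atomic.AtomicLong;

// Simplified hi/lo sketch: one "sequence call" hands out a block of maxLo identifiers.
final class HiLoSketch {
    private final AtomicLong sequence; // stand-in for the database sequence
    private final int maxLo;
    private long hi;
    private int lo;

    HiLoSketch(AtomicLong sequence, int maxLo) {
        this.sequence = sequence;
        this.maxLo = maxLo;
        this.lo = maxLo; // block counts as used up, so the first call fetches a hi value
    }

    synchronized long next() {
        if (lo >= maxLo) {
            hi = sequence.getAndIncrement(); // the only step that would hit the database
            lo = 0;
        }
        lo++;
        return hi * (maxLo + 1) + lo;
    }

    public static void main(String[] args) {
        HiLoSketch generator = new HiLoSketch(new AtomicLong(1), 9);
        StringBuilder values = new StringBuilder();
        for (int i = 0; i < 10; i++) {
            values.append(generator.next()).append(' ');
        }
        System.out.println(values.toString().trim());
        // prints "11 12 13 14 15 16 17 18 19 21"
    }
}
```

Ten identifiers cost two sequence calls here. A larger maximum lo value reduces round trips further, at the price of larger gaps when the application restarts with a partly used block.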
To summarize, our recommendations on identifier generator strategies are as follows:
■ In general, we prefer pre-insert generation strategies that produce identifier values independently before INSERT.
■ Use enhanced-sequence, which uses a native database sequence when supported and otherwise falls back to an extra database table with a single column and row, emulating a sequence.
We assume from now on that you’ve added identifier properties to the entity classes of your domain model and that after you complete the basic mapping of each entity and its identifier property, you continue to map the value-typed properties of the entities. We talk about value-type mappings in the next chapter. Read on for some special options that can simplify and enhance your class mappings.