Fixefid 1.0.0 Released!

Fixefid 1.0.0 has been released!

I’m very proud to announce that Fixefid 1.0.0 has been released!

Fixefid is a Java library for working with flat fixed formatted text files.

Fixefid Home Page

Fixefid Javadoc

Fixefid Maven Repository

Disable Spring Cloud Stream support for testing

TestSupportBinder is a minimal binder that does nothing, and it is not useful for integration tests between services

A short time ago we started the development of a project based on a microservices architecture. The intent is to create services based on REST APIs, which receive messages and write them to Kafka topics. Other services read messages from the Kafka topics and write them to the database.

The environment is as follows:

  • Java 11
  • Spring Core 5.1
  • Spring Boot 2.1
  • Spring Cloud Stream 2.1
  • Spring Web 5.1
  • Apache Kafka 2.0.1

The IDE is the new STS 4. By default, if you use Spring Initializr to create a Spring Cloud Stream based project, the Spring Cloud Stream test support dependency is added:

<dependency>
     <groupId>org.springframework.cloud</groupId>
     <artifactId>spring-cloud-stream-test-support</artifactId>
     <scope>test</scope>
</dependency>

The dependency ensures that the TestSupportBinder class is used during the test phase. TestSupportBinder is a minimal binder that does not actually bind consumers to any middleware.
I find that class not very useful, even for the test phase itself. It is certainly not useful for integration tests between the various services.
In fact, in our case, when launching the services from STS you could write to the topic, but the listeners received no messages. When launching the services via Maven, instead, everything worked fine. This is because the test scope of the dependency ensures that, when launching the command

mvn spring-boot:run

the TestSupportBinder class is not loaded by the Spring Boot autoconfiguration. However, if we run the application from STS (in RUN or DEBUG mode), TestSupportBinder is loaded instead of the desired Kafka binder. To disable the test binder when running from STS, you need to add the annotation

@SpringBootApplication(exclude = TestSupportBinderAutoConfiguration.class)

as in this example:

@SpringBootApplication(exclude = TestSupportBinderAutoConfiguration.class)
@EnableBinding(MsgStreams.class)
public class StreamsConfig {

}

or, even better, add the following line in the application.properties file:

spring.autoconfigure.exclude=org.springframework.cloud.stream.test.binder.TestSupportBinderAutoConfiguration

in this way, the integration tests between the various services will work correctly!
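
For context, MsgStreams in the example above is the bound-channels interface used with @EnableBinding; a minimal hypothetical sketch (the channel names are assumptions) could look like this:

import org.springframework.cloud.stream.annotation.Input;
import org.springframework.cloud.stream.annotation.Output;
import org.springframework.messaging.MessageChannel;
import org.springframework.messaging.SubscribableChannel;

public interface MsgStreams {

    String MSG_OUT = "msg-out";
    String MSG_IN = "msg-in";

    // outbound channel: the REST services write incoming messages here (bound to a Kafka topic)
    @Output(MSG_OUT)
    MessageChannel outboundMsg();

    // inbound channel: the persisting services consume messages from the bound Kafka topic
    @Input(MSG_IN)
    SubscribableChannel inboundMsg();
}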

Partial commit and job restart with Spring Batch

If a batch fails after a partial commit, it must be possible to resume processing of the file, skipping the lines already committed

A classic batch job is the processing of a file: records are read, and for each record the data is processed and persisted to the database (reader, processor and writer).

In case the file is large and contains thousands of records, partial commits must be planned during processing. For example, every 1000 records we can decide to commit the processing to the database.

With Spring Batch it is very easy to get partial commits: the commit interval is a simple parameter passed to the StepBuilder, as in the sketch below. If more complex partial commit policies are needed, it is possible to implement custom completion policies (which we will not cover here, as they are not the subject of this article).
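
A minimal sketch of such a step, in the Java configuration style (the Person type and the reader/processor/writer beans are hypothetical):

import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ImportStepConfig {

    // chunk-oriented step: read, process and write in chunks of 1000 items,
    // with one transaction (commit) per chunk
    @Bean
    public Step importStep(StepBuilderFactory stepBuilderFactory,
            ItemReader<String> reader,
            ItemProcessor<String, Person> processor,
            ItemWriter<Person> writer) {
        return stepBuilderFactory.get("importStep")
                .<String, Person>chunk(1000) // the commit interval
                .reader(reader)
                .processor(processor)
                .writer(writer)
                .build();
    }
}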

Finally, if a batch fails after a partial commit, it must be possible to resume processing of the file, skipping the lines already committed. This too is supported by Spring Batch, but we must add a few lines of code; it is not a simple configuration parameter. In this case as well, it is possible to implement custom skip and retry policies (which, again, we will not cover here, as they are not the subject of this article).

The environment is as follows:

  • Java 7
  • Spring Boot 1.1.8
  • Spring Batch 3.0

There are two concepts: the job instance and the job execution. A job instance is accomplished through n executions (typically one; more if there have been failures). Furthermore, only a failed job execution can be restarted.

To implement the restart we need the jobRegistry, jobOperator, jobExplorer and jobLauncher beans; a sketch of the relevant code follows the steps below.

Basically these are the steps:

  • register the job in the jobRegistry
  • get job instances through the jobOperator
  • given the last instance, get executions through the jobOperator
  • through the jobExplorer check if the last execution has failed
  • in case the last execution has failed, the job must be restarted via the jobOperator
  • in case the last execution was successful, launch a new job instance via the jobLauncher
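
A minimal sketch of these steps (the four beans above are assumed to be injected; the class and method names are hypothetical):

import java.util.List;

import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.configuration.JobRegistry;
import org.springframework.batch.core.configuration.support.ReferenceJobFactory;
import org.springframework.batch.core.explore.JobExplorer;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.batch.core.launch.JobOperator;

public class JobRestartLauncher {

    private final JobRegistry jobRegistry;
    private final JobOperator jobOperator;
    private final JobExplorer jobExplorer;
    private final JobLauncher jobLauncher;

    public JobRestartLauncher(JobRegistry jobRegistry, JobOperator jobOperator,
            JobExplorer jobExplorer, JobLauncher jobLauncher) {
        this.jobRegistry = jobRegistry;
        this.jobOperator = jobOperator;
        this.jobExplorer = jobExplorer;
        this.jobLauncher = jobLauncher;
    }

    public void launchOrRestart(Job job, JobParameters params) throws Exception {
        // register the job in the jobRegistry so the jobOperator can find it by name
        jobRegistry.register(new ReferenceJobFactory(job));

        // get the job instances through the jobOperator (most recent first)
        List<Long> instanceIds = jobOperator.getJobInstances(job.getName(), 0, 1);
        if (!instanceIds.isEmpty()) {
            // given the last instance, get its executions through the jobOperator
            List<Long> executionIds = jobOperator.getExecutions(instanceIds.get(0));

            // through the jobExplorer, check if the last execution has failed
            JobExecution last = jobExplorer.getJobExecution(executionIds.get(0));
            if (last.getStatus() == BatchStatus.FAILED) {
                // failed: restart it via the jobOperator
                jobOperator.restart(executionIds.get(0));
                return;
            }
        }
        // last execution successful (or no previous run): launch a new instance
        jobLauncher.run(job, params);
    }
}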

Spring Batch takes care of managing the restart starting from the first uncommitted record.

Thanks Spring Batch 🙂


Performance tuning for File Reading with Java 8 and Parallel Streams

Is it always convenient to use parallel streams?

With streams, Java 8 introduced a very simple way to use all the resources made available by the hardware, in particular the cores of multicore architectures, and all this with a declarative rather than imperative programming paradigm.

For example, suppose you have to write a batch that looks for a string in all the files that have a certain prefix in a certain directory. The batch must report in a log which files contain the string.

It is interesting to compare the two approaches: pre-Java 8, and Java 8 with streams.

I used my dev computer, a Samsung laptop with these features:

  • Intel Core i7-4500U (2 cores, 4 threads) 1.8 GHz
  • 8 GB RAM
  • 256 GB SSD
  • Windows 8.1 Pro 64 bit
  • Java 1.8.0_101

Having 4 hardware threads, I expected the performance of the parallel stream solution to be up to 4 times better than the classic pre-Java 8 version.

The code that uses the pre-Java 8 style is as follows: StringMatchingOld

The code that uses the streams is as follows: StringMatching

The code that uses the parallel streams is as follows: ParallelStringMatching
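
Since the three sources are linked above, here is just a minimal sketch of the parallel-streams variant (the directory, prefix and search string are hypothetical):

import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class ParallelStringMatchingSketch {

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get("C:/data");   // hypothetical directory
        String prefix = "log_";            // hypothetical file prefix
        String needle = "ERROR";           // hypothetical string to search

        try (Stream<Path> files = Files.list(dir)) {
            files.filter(p -> p.getFileName().toString().startsWith(prefix))
                 .parallel() // split the matching work across the available cores
                 .filter(p -> {
                     try (Stream<String> lines = Files.lines(p)) {
                         return lines.anyMatch(line -> line.contains(needle));
                     }
                     catch (IOException e) {
                         throw new UncheckedIOException(e);
                     }
                 })
                 .forEach(p -> System.out.println("String found in: " + p));
        }
    }
}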

These are the results (4 files read):

[Figure: File Reading Performance]

What is going on? Truly strange: there is no benefit in using streams (not even, I think, from the point of view of the code… I don’t find it more readable, but that is another story… I’m getting older…). Furthermore, using parallel streams does nothing but make things worse.

Conclusions

It seems that file reading does not benefit from streams. A plausible explanation is that the task is I/O-bound: the bottleneck is the disk, not the CPU, so adding threads brings no gain. However, the fact remains that declarative programming makes it possible to abstract from the current hardware, so perhaps on a server with better hardware than my development PC, parallel stream performance would be better.

Or maybe I’m doing something wrong… if so, I look forward to finding out.

Data Masking with JPA and Spring Security

The protection of sensitive data is an increasingly popular topic in IT applications

The protection of sensitive data is an increasingly popular topic in IT applications. In our case too, a customer asked us to implement, on an already existing web application, a data masking solution that is dynamic and based on security profiles.

The application is developed in Java, with Spring MVC for the Model View Controller management, JPA for data access and Spring Security for the management of security profiles.

There are two approaches in literature: SDM (Static Data Masking) and DDM (Dynamic Data Masking).

SDM

SDM involves cloning the current database while masking sensitive data. Specific inquiry applications that require data masking then read from the cloned database.

Advantages:

  • fast data access at runtime

Disadvantages:

  • the data read may be stale (the clone is updated via batch and, depending on the mode, the update can take from minutes to hours)
  • not ideal for a role-based / field-based security scenario

DDM

DDM masks the data when it is read, at runtime.

Advantages:

  • the data read is always up to date
  • ideal for a role-based / field-based security scenario

Disadvantages:

  • read/write performance overhead
  • unmask algorithms may be needed to avoid data corruption (to prevent the masked data from being persisted to the DB)

Given the customer’s requests, the DDM technique is the one that best suits a dynamic scenario based on security profiles.

At this point another choice had to be made because for DDM there are two approaches:

JPA Rewriting

In the literature this is called SQL Rewriting; in our specific case it is JPA rewriting, JPA being our data access layer. The data is masked in a @PostLoad or @PostUpdate annotated method of a JPA entity listener, that is, in the persistence layer.

Advantages:

  • masking is applied in one place, when the data is loaded from the DB
  • easy data-masking mapping

Disadvantages:

  • the masking depends on the data type (for example a string can be masked with ‘***’ or ‘###’, a number with ‘000’ or ‘999’, a date with ‘99/99/9999’, and so on)
  • difficulty in the look & feel when rendering the view if the data is masked (each view should declare the masking… which brings us back to the View Rewriting case below)
  • unmask algorithms that use the user session to store the unmasked data. JPA shares objects loaded from the DB, so there is no guarantee that an object loaded by an inquiry function will not later be used by an update function. In that case the masked data would be persisted to the DB, which means data corruption
  • making the masking function-dependent is complex (the user session must be used for the function-masking mapping)
  • complex use of the user session (see above for unmask and function-masking mapping)

View Rewriting

The data is masked in the presentation layer, typically in JSP pages.

Advantages:

  • homogeneous masking (it does not depend on the type of data; everything can be masked, for example, with ‘***’)
  • no unmask phase is required
  • easy rendering of the look & feel (each view declares whether or not it wants masking)
  • easy to make it function-dependent (each function declares whether or not it wants masking)

Disadvantages:

  • masking is not centralized (all the views must mask… the tags reused by the views simplify this, but not completely)
  • difficult data-masking mapping (each view must declare the data it displays)

We chose to adopt View Rewriting because, analyzing the effort (which we omit here because it is not relevant), it was more or less similar between the two approaches, while the risks of data corruption and of user-session out-of-memory errors are absent. Moreover, the View Rewriting solution is much more customizable as far as the look & feel is concerned.

To implement the solution we need the following:

  • a generic editor to enable or disable masking for a field
  • a masking class that performs data masking based on security profiles
  • to modify all existing views to use the masking class above

Let’s look at each point in detail.

Role-based security mapping

We use a role-based security mapping based on Spring Security (already present in the application). For any data that you want to mask, a role is created with this form:

ROLE_MASK_DOMAIN-NAME_FIELD-NAME

for example, if I want to mask the tax code field of the people table, since the field is mapped via JPA as Person.taxCode, the role will be

ROLE_MASK_PERSON_TAXCODE

The mapping is edited dynamically with a dedicated GUI function. We used the existing Domain Editor function, a generic domain editor that, for all domain classes, allows the modification of all the fields mapped to the database.
We added a new editing form for managing the data-masking mapping.
The form contains all the fields of the chosen domain class. For each field you can choose (with a dedicated checkbox) whether or not to enable its masking. When saving, the function performs the following step:

  • it looks in the Authorities table for the role ROLE_MASK_DOMAIN-NAME_FIELD-NAME. If the role does not exist, it is created (and, conversely, it is deleted if masking for the field must be disabled)

For the mapping with profiles (Spring Security groups), the Spring Security functions already implemented in the appropriate views of the application are used.

Masking class

We created a class that receives as input the data to be masked and its name (for example, Person.taxCode).
The class checks (with the methods that Spring Security provides) whether the current user’s profile is associated with the role corresponding to the field (ROLE_MASK_PERSON_TAXCODE for Person.taxCode). If it is, the class masks the data and returns it to the view.
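
A minimal sketch of such a class, assuming the role convention above and a ‘***’ mask (the class and method names are hypothetical):

import org.springframework.security.core.GrantedAuthority;
import org.springframework.security.core.context.SecurityContextHolder;

public class MaskingUtil {

    private static final String MASK = "***";

    /**
     * Masks the value if the current user's profile carries the corresponding
     * ROLE_MASK_* authority; e.g. mask(taxCode, "Person.taxCode") checks
     * for ROLE_MASK_PERSON_TAXCODE.
     */
    public static String mask(String value, String fieldName) {
        String role = "ROLE_MASK_" + fieldName.replace('.', '_').toUpperCase();
        for (GrantedAuthority authority : SecurityContextHolder.getContext()
                .getAuthentication().getAuthorities()) {
            if (role.equals(authority.getAuthority())) {
                return MASK;
            }
        }
        return value;
    }
}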

Change Views

The functions that need to mask the data are typically the inquiry ones. In our case we are helped by the fact that we adopted tags in the presentation layer, so all the show and list pages use a display.tagx tag and a table.tagx tag. We need to change these two tags to make them use the masking class.
The longest part of the work is modifying all the JSPs that use the two tags, which must declare the name of the field they are displaying.

Finally, we also modified the search filters to ensure that if a filter allows searching on a field that must be masked, the filter is disabled.
For example, if the filter allows a search by tax code, the filter must use the masking class to know at runtime whether the current profile masks this data.
If so, the filter is disabled.

Conclusions

View Rewriting with role-based security is the best solution for the following reasons:

  • effort slightly greater than, but more or less similar to, the JPA Rewriting solution
  • use of Spring Security to map the data to be masked to the profiles
  • greater customization in terms of look & feel
  • absence of data corruption risk
  • absence of user session out of memory risk

IBAN, iban4j and CIN calculation

The International Bank Account Number (IBAN) is used to uniquely identify bank details internationally.

The code is as follows:

  • 2 capital letters representing the nation (IT for Italy)
  • 2 check digits
  • the national BBAN code

For Italy, the BBAN code (Basic Bank Account Number) is composed of:

  • CIN (1 uppercase letter)
  • ABI (5 digits)
  • CAB (5 digits)
  • Account number (12 alphanumeric characters possibly preceded by zeros if the number of characters is less than 12)

The CIN (Control Internal Number) code consists of a single letter and is used as a control character. It is calculated from the ABI and CAB codes and the account number.

Both the two check digits and the CIN can be calculated to verify that an IBAN entered in a form by a user is valid and compliant. To do this in Java there is the iban4j library.

The problem with this library is that it calculates the two check digits but not the CIN. On the net I did not find any Java library that performs the CIN calculation. I only found an example written in Visual Basic, which I ported to Java. The class is named CINUtil and it can be downloaded from Pastebin.

A method for checking the IBAN entered in a form by a user can be the following:


public static boolean checkIban(String ibanCode) {
    // Italian IBAN layout: 2 country letters + 2 check digits + 1 CIN
    // + 5 ABI digits + 5 CAB digits + 12 account characters
    String countryCode = ibanCode.substring(0, 2);
    String abi = ibanCode.substring(5, 10);
    String cab = ibanCode.substring(10, 15);
    String conto = ibanCode.substring(15);

    // rebuild the IBAN from its parts: iban4j computes the two check digits,
    // CINUtil computes the CIN
    org.iban4j.Iban ibanToCheck = new org.iban4j.Iban.Builder()
            .countryCode(CountryCode.valueOf(countryCode))
            .bankCode(abi)
            .nationalCheckDigit(CINUtil.computeCin(abi, cab, conto))
            .branchCode(cab)
            .accountNumber(conto)
            .build(true);

    // the input is valid only if it equals the freshly rebuilt IBAN
    return ibanCode.equals(ibanToCheck.toString());
}
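
For example, calling the method with the standard sample Italian IBAN used in the documentation (a hypothetical usage):

boolean valid = checkIban("IT60X0542811101000000123456"); // true: both the check digits and the CIN verify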

Howto migrate LDAP users to Spring Security JDBC

Lately we have been redoing a web portal whose users, who have access to the private area of the application, are registered through LDAP.
The only request made by the contractor is that the transition to the new portal must be transparent to registered users. This means that users must not have to change their password the first time they log in to the new portal. Passwords are stored in the LDAP repository with SSHA (Salted SHA) encoding.
Our application uses Spring Security to manage security and access to the reserved area. Spring Security supports various types of authentication, including LDAP itself. As a first idea we thought of using the same LDAP repository already in place; after analyzing this solution in detail, we decided not to adopt it for various reasons.
The first is having to map the roles related to the permissions of the old application onto the roles of our application (feasible, but not very nice from a functional point of view). Furthermore, having to maintain two separate servers, an LDAP server and a DBMS, when a single DBMS server is enough, is not a good thing in terms of management costs.
So we decided to use the classic Spring Security JDBC authentication. The users will be migrated through a batch that loads the LDAP user dump (exported, for example, in CSV format) into the JDBC tables. To ensure that the password encryption remains the same, just configure Spring Security to use the LdapShaPasswordEncoder class. To do this you need to define the following bean in WebMvcConfiguration:

@Bean
public LdapShaPasswordEncoder passwordEncoderLDAP() {
    return new LdapShaPasswordEncoder();
}

and use it in the AuthenticationManagerBuilder defined in WebSecurityConfiguration like this:

@Autowired
private LdapShaPasswordEncoder ldapPasswordEncoder;

@Override
protected void configure(AuthenticationManagerBuilder auth) throws Exception {
    auth.userDetailsService(webJdbcUserDetailsManager).passwordEncoder(ldapPasswordEncoder);
}

In this way the password encoding will be the same as the one used by LDAP, and for users the transition to the new portal will be transparent.
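
As a quick sanity check, the encoder should round-trip an SSHA hash; a minimal sketch, assuming Spring Security 5’s crypto-package LdapShaPasswordEncoder:

import org.springframework.security.crypto.password.LdapShaPasswordEncoder;

public class SshaCheck {

    public static void main(String[] args) {
        // with the default constructor the encoder salts the hash, producing {SSHA} values
        LdapShaPasswordEncoder encoder = new LdapShaPasswordEncoder();

        String encoded = encoder.encode("secret");
        System.out.println(encoded);                             // e.g. {SSHA}base64...
        System.out.println(encoder.matches("secret", encoded));  // true
    }
}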