Spring Batch : CSV to Database
- General
Spring Batch : CSV to Database
Introduction To Spring Batch
Spring Batch provides reusable functions that are essential in processing large volumes of records, including logging/tracing, transaction management, job processing statistics, job restart, skip, and resource management. It also provides more advanced technical services and features that will enable extremely high-volume and high performance batch jobs through optimization and partitioning techniques. Simple as well as complex, high-volume batch jobs can leverage the framework in a highly scalable manner to process significant volumes of information.
Features
- Transaction management
- Chunk based processing
- Declarative I/O
- Start/Stop/Restart
- Retry/Skip
- Web based administration interface
Use-Case
In this article, we’ll go through how to create a Spring Batch Job from start to end for inserting high-volume data from CSV into a database with the help of Batch Core components which otherwise is a tedious process to do i.e. create a statement and execute the query.
Apart from the previously used strategies, Spring Batch provides us with an Admin Dashboard to monitor jobs, outcomes, and logs. And with @Transactional Annotation, we can roll back the entire transaction steps if an error occurred.
Spring Batch Core Components And Flow
Creating a Spring Batch Project
Spring Project Configuration
spring.datasource.driver-class-name=com.mysql.cj.jdbc.Driver spring.datasource.url = jdbc:mysql://localhost:3306/DB_Name spring.datasource.username = username spring.datasource.password = password #spring.jpa.show-sql = true spring.jpa.hibernate.ddl-auto = update spring.jpa.properties.hibernate.dialect = org.hibernate.dialect.MySQL5Dialect #server.port=9191 spring.batch.initialize-schema=ALWAYS #disabled job run at startup spring.batch.job.enabled=false
Creating entity for our CSV File
@Entity @Table(name = "CUSTOMERS_INFO") @Data @AllArgsConstructor @NoArgsConstructor public class Customer { @Id @Column(name = "CUSTOMER_ID") private int id; @Column(name = "FIRST_NAME") private String firstName; @Column(name = "LAST_NAME") private String lastName; @Column(name = "EMAIL") private String email; @Column(name = "GENDER") private String gender; @Column(name = "CONTACT") private String contactNo; @Column(name = "COUNTRY") private String country; @Column(name = "DOB") private String dob; }
Configuration For Job, Step, ItemReader, ItemWriter, ItemProcessor
@Configuration @EnableBatchProcessing @AllArgsConstructor public class SpringBatchConfig { private JobBuilderFactory jobBuilderFactory; private StepBuilderFactory stepBuilderFactory; private CustomerRepository customerRepository; @Bean public FlatFileItemReader<Customer> reader() { FlatFileItemReader<Customer> itemReader = new FlatFileItemReader<>(); itemReader.setResource(new FileSystemResource("src/main/resources/customers.csv")); itemReader.setName("csvReader"); itemReader.setLinesToSkip(1); itemReader.setLineMapper(lineMapper()); return itemReader; } private LineMapper<Customer> lineMapper() { DefaultLineMapper<Customer> lineMapper = new DefaultLineMapper<>(); DelimitedLineTokenizer lineTokenizer = new DelimitedLineTokenizer(); lineTokenizer.setDelimiter(","); lineTokenizer.setStrict(false); lineTokenizer.setNames("id", "firstName", "lastName", "email", "gender", "contactNo", "country", "dob"); BeanWrapperFieldSetMapper<Customer> fieldSetMapper = new BeanWrapperFieldSetMapper<>(); fieldSetMapper.setTargetType(Customer.class); lineMapper.setLineTokenizer(lineTokenizer); lineMapper.setFieldSetMapper(fieldSetMapper); return lineMapper; } @Bean public CustomerProcessor processor() { return new CustomerProcessor(); } @Bean public RepositoryItemWriter<Customer> writer() { RepositoryItemWriter<Customer> writer = new RepositoryItemWriter<>(); writer.setRepository(customerRepository); writer.setMethodName("save"); return writer; } @Bean public Step step1() { return stepBuilderFactory.get("csv-step").<Customer, Customer>chunk(10) .reader(reader()) .processor(processor()) .writer(writer()) .taskExecutor(taskExecutor()) .build(); } @Bean public Job runJob() { return jobBuilderFactory.get("importCustomers") .flow(step1()).end().build(); } @Bean public TaskExecutor taskExecutor() { SimpleAsyncTaskExecutor asyncTaskExecutor = new SimpleAsyncTaskExecutor(); asyncTaskExecutor.setConcurrencyLimit(10); return asyncTaskExecutor; } }
Custom Item Processor
public class CustomerProcessor implements ItemProcessor<Customer,Customer> { @Override public Customer process(Customer customer) throws Exception { if(customer.getCountry().equals("United States")) { return customer; }else{ return null; } } }
Customer Repository
public interface CustomerRepository extends JpaRepository<Customer,Integer> { }
Job Controller
@RestController @RequestMapping("/jobs") public class JobController { @Autowired private JobLauncher jobLauncher; @Autowired private Job job; @PostMapping("/importCustomers") public void importCsvToDBJob() { JobParameters jobParameters = new JobParametersBuilder() .addLong("startAt", System.currentTimeMillis()).toJobParameters(); try { jobLauncher.run(job, jobParameters); } catch (JobExecutionAlreadyRunningException | JobRestartException | JobInstanceAlreadyCompleteException | JobParametersInvalidException e) { e.printStackTrace(); } } }
Batch Job Output in DB
Related content
Toll mangement and command centre with TMCC
We’re passionately committed to helping our clients and their customers thrive, working side by side to drive customer value and results..
A Smarter Health Safety Solution
We’re passionately committed to helping our clients and their customers thrive, working side by side to drive customer value and results..
Building fastest loan portal in India
We’re passionately committed to helping our clients and their customers thrive, working side by side to drive customer value and results..
Toll mangement and command centre with TMCC
We’re passionately committed to helping our clients and their customers thrive, working side by side to drive customer value and results...
Toll mangement and command centre with TMCC
We’re passionately committed to helping our clients and their customers thrive, working side by side to drive customer value and results..
Entreprise IT Transformation and Automation
We understand user and market, create product strategy and design experience for customers and employees to make breakthrough digital products and services