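1. High development productivity
Spring Batch raises developer productivity and makes batch applications easier and faster to build. It provides a set of abstractions and metadata for defining simple batch jobs, and these can be extended to support more complex flows. A developer only has to define a job's input (Reader), processing (Processor), and output (Writer) stages, rather than hand-writing the boilerplate and patterns that a batch program built on plain Java APIs would need. Spring Batch also ships utilities such as JobLauncher and JobOperator that make it simple to start and manage jobs.
The following simple example shows how to use Spring Batch to read person records from an XML file and from a database, process them, and write them to XML and text files. The configuration defines a single Job with two Steps: one reads from the XML file and the other reads from the database.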
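import java.util.Arrays;

import javax.sql.DataSource;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.launch.support.RunIdIncrementer;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.database.JdbcCursorItemReader;
import org.springframework.batch.item.file.FlatFileItemWriter;
import org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor;
import org.springframework.batch.item.file.transform.DelimitedLineAggregator;
import org.springframework.batch.item.support.CompositeItemWriter;
import org.springframework.batch.item.xml.StaxEventItemReader;
import org.springframework.batch.item.xml.StaxEventItemWriter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.ClassPathResource;
import org.springframework.core.io.FileSystemResource;
import org.springframework.oxm.jaxb.Jaxb2Marshaller;

// Targets the Spring Batch 4.x API (JobBuilderFactory / StepBuilderFactory).
// Person is an application class (not shown) that must be JAXB-annotated,
// for example with @XmlRootElement(name = "person").
@Configuration
@EnableBatchProcessing
public class BatchConfiguration {

    private final JobBuilderFactory jobBuilderFactory;
    private final StepBuilderFactory stepBuilderFactory;

    public BatchConfiguration(JobBuilderFactory jobBuilderFactory,
                              StepBuilderFactory stepBuilderFactory) {
        this.jobBuilderFactory = jobBuilderFactory;
        this.stepBuilderFactory = stepBuilderFactory;
    }

    // Jaxb2Marshaller implements both Marshaller and Unmarshaller, so one bean is enough.
    @Bean
    public Jaxb2Marshaller personMarshaller() {
        Jaxb2Marshaller marshaller = new Jaxb2Marshaller();
        marshaller.setClassesToBeBound(Person.class);
        return marshaller;
    }

    // Reads <person> fragments from persons.xml on the classpath.
    @Bean
    public StaxEventItemReader<Person> xmlPersonReader() {
        StaxEventItemReader<Person> reader = new StaxEventItemReader<>();
        reader.setResource(new ClassPathResource("persons.xml"));
        reader.setFragmentRootElementName("person");
        reader.setUnmarshaller(personMarshaller());
        return reader;
    }

    // Reads rows from the person table and maps each row to a Person.
    @Bean
    public JdbcCursorItemReader<Person> dbPersonReader(DataSource dataSource) {
        JdbcCursorItemReader<Person> reader = new JdbcCursorItemReader<>();
        reader.setDataSource(dataSource);
        reader.setSql("select id, first_name, last_name, email, age from person");
        reader.setRowMapper((resultSet, rowNum) -> {
            Person person = new Person();
            person.setId(resultSet.getLong("id"));
            person.setFirstName(resultSet.getString("first_name"));
            person.setLastName(resultSet.getString("last_name"));
            person.setEmail(resultSet.getString("email"));
            person.setAge(resultSet.getInt("age"));
            return person;
        });
        return reader;
    }

    // Writes to persons.xml in the working directory (not the classpath input file).
    @Bean
    public StaxEventItemWriter<Person> xmlPersonWriter() {
        StaxEventItemWriter<Person> writer = new StaxEventItemWriter<>();
        writer.setResource(new FileSystemResource("persons.xml"));
        writer.setRootTagName("persons");
        writer.setMarshaller(personMarshaller());
        writer.setOverwriteOutput(true);
        return writer;
    }

    // Writes tab-separated lines to persons.txt; both steps append to the same file.
    @Bean
    public FlatFileItemWriter<Person> txtPersonWriter() {
        BeanWrapperFieldExtractor<Person> extractor = new BeanWrapperFieldExtractor<>();
        extractor.setNames(new String[] {"id", "firstName", "lastName", "email", "age"});

        DelimitedLineAggregator<Person> aggregator = new DelimitedLineAggregator<>();
        aggregator.setDelimiter("\t");
        aggregator.setFieldExtractor(extractor);

        FlatFileItemWriter<Person> writer = new FlatFileItemWriter<>();
        writer.setResource(new FileSystemResource("persons.txt"));
        writer.setLineAggregator(aggregator);
        writer.setAppendAllowed(true);
        // The writer adds the line separator after the header itself.
        writer.setHeaderCallback(out -> out.write("Id\tFirst Name\tLast Name\tEmail\tAge"));
        return writer; // Spring calls afterPropertiesSet() on beans automatically
    }

    // A step accepts one writer only: calling .writer(...) twice keeps just the last one.
    // CompositeItemWriter fans each chunk out to both the text and the XML writer.
    @Bean
    public CompositeItemWriter<Person> xmlStepWriter() {
        CompositeItemWriter<Person> writer = new CompositeItemWriter<>();
        writer.setDelegates(Arrays.asList(txtPersonWriter(), xmlPersonWriter()));
        return writer;
    }

    // Upper-cases the name and email fields of every person.
    @Bean
    public ItemProcessor<Person, Person> personUpperProcessor() {
        return person -> {
            person.setFirstName(person.getFirstName().toUpperCase());
            person.setLastName(person.getLastName().toUpperCase());
            person.setEmail(person.getEmail().toUpperCase());
            return person;
        };
    }

    @Bean
    public Step xmlPersonStep() {
        return stepBuilderFactory.get("xmlPersonReadingStep")
                .<Person, Person>chunk(10)
                .reader(xmlPersonReader())
                .processor(personUpperProcessor())
                .writer(xmlStepWriter())
                .build();
    }

    @Bean
    public Step dbPersonStep(JdbcCursorItemReader<Person> dbPersonReader) {
        return stepBuilderFactory.get("dbPersonReadingStep")
                .<Person, Person>chunk(10)
                .reader(dbPersonReader)
                .processor(personUpperProcessor())
                .writer(txtPersonWriter())
                .build();
    }

    // One job, two steps: the XML step runs first, then the database step.
    @Bean
    public Job job(Step xmlPersonStep, Step dbPersonStep) {
        return jobBuilderFactory.get("personJob")
                .incrementer(new RunIdIncrementer())
                .start(xmlPersonStep)
                .next(dbPersonStep)
                .build();
    }
}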
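The `chunk(10)` setting in the steps above tells Spring Batch to read, process, and write items in groups of ten. The plain-Java sketch below illustrates that read-process-write loop in isolation. It is not Spring Batch's implementation: `ChunkLoopSketch` and its `run` method are hypothetical names, and transactions, restart, and error handling are deliberately left out.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Conceptual sketch only: ChunkLoopSketch is a hypothetical class, not Spring Batch code.
class ChunkLoopSketch {

    /**
     * Reads up to chunkSize items, processes each one, then hands the whole
     * chunk to the writer. Returns the number of chunks that were read.
     */
    static <I, O> int run(Iterator<I> reader, Function<I, O> processor,
                          Consumer<List<O>> writer, int chunkSize) {
        int chunks = 0;
        while (reader.hasNext()) {
            // 1. Read: collect at most chunkSize items.
            List<I> inputs = new ArrayList<>(chunkSize);
            while (inputs.size() < chunkSize && reader.hasNext()) {
                inputs.add(reader.next());
            }
            // 2. Process: a null result filters the item out.
            List<O> outputs = new ArrayList<>(inputs.size());
            for (I input : inputs) {
                O output = processor.apply(input);
                if (output != null) {
                    outputs.add(output);
                }
            }
            // 3. Write: one writer call per chunk.
            if (!outputs.isEmpty()) {
                writer.accept(outputs);
            }
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> names = List.of("alice", "bob", "carol", "dave", "erin");
        List<List<String>> written = new ArrayList<>();
        int chunks = ChunkLoopSketch.<String, String>run(
                names.iterator(), s -> s.toUpperCase(), chunk -> written.add(chunk), 2);
        System.out.println(chunks + " chunks written: " + written);
    }
}
```

Writing once per chunk rather than once per item is what keeps I/O and commit overhead low; a larger chunk size means fewer writes but more work to redo if a chunk fails.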
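2. High-performance processing of large data volumes
Spring Batch can process large volumes of data efficiently. Its chunk-oriented processing model reads, processes, and writes items in batches, which improves the speed and throughput of a batch run. When handling large data sets, the work of a step can be divided according to its configuration so that the data is processed in parallel, which makes better use of multi-core processors.
The following code shows how to process a large file in parallel with multiple threads in Spring Batch: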
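import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.launch.support.RunIdIncrementer;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.FlatFileItemWriter;
import org.springframework.batch.item.file.mapping.DefaultLineMapper;
import org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor;
import org.springframework.batch.item.file.transform.DelimitedLineAggregator;
import org.springframework.batch.item.file.transform.DelimitedLineTokenizer;
import org.springframework.batch.item.support.SynchronizedItemStreamReader;
import org.springframework.batch.item.support.SynchronizedItemStreamWriter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.ClassPathResource;
import org.springframework.core.io.FileSystemResource;
import org.springframework.core.task.TaskExecutor;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

// Targets Spring Batch 4.x. FileContent and FileContentFieldSetMapper are
// application classes that are not shown here.
@Configuration
@EnableBatchProcessing
public class BatchConfiguration {

    private final JobBuilderFactory jobBuilderFactory;
    private final StepBuilderFactory stepBuilderFactory;

    public BatchConfiguration(JobBuilderFactory jobBuilderFactory,
                              StepBuilderFactory stepBuilderFactory) {
        this.jobBuilderFactory = jobBuilderFactory;
        this.stepBuilderFactory = stepBuilderFactory;
    }

    // FlatFileItemReader is not thread-safe, so a multi-threaded step needs a synchronized wrapper.
    @Bean
    public SynchronizedItemStreamReader<FileContent> reader() {
        DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
        tokenizer.setDelimiter(",");
        tokenizer.setNames(new String[] {"id", "filename", "content"});

        DefaultLineMapper<FileContent> lineMapper = new DefaultLineMapper<>();
        lineMapper.setLineTokenizer(tokenizer);
        lineMapper.setFieldSetMapper(new FileContentFieldSetMapper());

        FlatFileItemReader<FileContent> delegate = new FlatFileItemReader<>();
        delegate.setResource(new ClassPathResource("data.txt"));
        delegate.setLineMapper(lineMapper);
        delegate.setSaveState(false); // restart state is unreliable with concurrent reads

        SynchronizedItemStreamReader<FileContent> reader = new SynchronizedItemStreamReader<>();
        reader.setDelegate(delegate);
        return reader;
    }

    @Bean
    public ItemProcessor<FileContent, FileContent> processor() {
        return fileContent -> {
            fileContent.setContent(fileContent.getContent().toUpperCase());
            return fileContent;
        };
    }

    // The file writer is wrapped as well (SynchronizedItemStreamWriter needs Spring Batch 4.3+).
    @Bean
    public SynchronizedItemStreamWriter<FileContent> writer() {
        BeanWrapperFieldExtractor<FileContent> extractor = new BeanWrapperFieldExtractor<>();
        extractor.setNames(new String[] {"id", "filename", "content"});

        DelimitedLineAggregator<FileContent> aggregator = new DelimitedLineAggregator<>();
        aggregator.setDelimiter(",");
        aggregator.setFieldExtractor(extractor);

        FlatFileItemWriter<FileContent> delegate = new FlatFileItemWriter<>();
        delegate.setResource(new FileSystemResource("processed-data.txt"));
        delegate.setLineAggregator(aggregator);

        SynchronizedItemStreamWriter<FileContent> writer = new SynchronizedItemStreamWriter<>();
        writer.setDelegate(delegate);
        return writer;
    }

    // Each chunk of 1000 items runs on a pool thread; output order is not guaranteed.
    @Bean
    public Step step1() {
        return stepBuilderFactory.get("step1")
                .<FileContent, FileContent>chunk(1000)
                .reader(reader())
                .processor(processor())
                .writer(writer())
                .taskExecutor(taskExecutor())
                .throttleLimit(10) // the default limit is 4 concurrent chunks
                .build();
    }

    @Bean
    public TaskExecutor taskExecutor() {
        ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
        taskExecutor.setCorePoolSize(10);
        taskExecutor.setMaxPoolSize(20);
        return taskExecutor; // Spring initializes the pool when it creates the bean
    }

    @Bean
    public Job job() {
        return jobBuilderFactory.get("job")
                .incrementer(new RunIdIncrementer())
                .start(step1())
                .build();
    }
}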
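When several threads work on the same step, they all pull items from one reader, so reads have to be serialized or items can be lost or read twice. The plain-Java sketch below shows that idea on its own, without Spring Batch. `SharedReaderSketch`, `SynchronizedReader`, and `processAll` are hypothetical names chosen for illustration, and the upper-casing stands in for whatever processing a real step would do.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Conceptual sketch only: SharedReaderSketch is hypothetical, not a Spring Batch class.
class SharedReaderSketch {

    /** Wraps a non-thread-safe iterator so that only one thread reads at a time. */
    static final class SynchronizedReader<T> {
        private final Iterator<T> delegate;

        SynchronizedReader(Iterator<T> delegate) {
            this.delegate = delegate;
        }

        /** Returns the next item, or null once the input is exhausted. */
        synchronized T read() {
            return delegate.hasNext() ? delegate.next() : null;
        }
    }

    /** Lets the given number of workers drain one shared reader; result order is not guaranteed. */
    static Queue<String> processAll(List<String> input, int threads) {
        SynchronizedReader<String> reader = new SynchronizedReader<>(input.iterator());
        Queue<String> output = new ConcurrentLinkedQueue<>(); // thread-safe sink
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                String item;
                while ((item = reader.read()) != null) {
                    output.add(item.toUpperCase()); // processing happens outside the lock
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return output;
    }

    public static void main(String[] args) {
        Queue<String> result = processAll(List.of("a", "b", "c", "d"), 2);
        System.out.println(result.size() + " items processed");
    }
}
```

Only the read is synchronized; the processing runs concurrently, which is where the speed-up comes from. The trade-off is that output order no longer matches input order.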
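3. Easy to maintain and test
Spring Batch offers many features that make applications easy to maintain and test. Jobs and steps are defined as separate components, and readers, processors, and writers are ordinary Java classes, so each piece is easy to define, modify, and test on its own. Because the ItemReader, ItemWriter, and ItemProcessor interfaces are small, individual components can be covered by unit tests, and whole steps or jobs by integration tests.
The following code shows how to unit-test a Spring Batch item processor with JUnit: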
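import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertNotNull;

import org.junit.Test;
import org.springframework.batch.item.ItemProcessor;

// Assumes JUnit 4. Person, PersonDto and PersonItemProcessor are application classes (not shown).
public class PersonItemProcessorTest {

    private final ItemProcessor<Person, PersonDto> processor = new PersonItemProcessor();

    @Test
    public void testProcessor() throws Exception {
        Person person = new Person();
        person.setFirstName("Alice");
        person.setLastName("Smith");
        person.setEmail("alice.smith@example.com");
        person.setAge(30);

        PersonDto personDto = processor.process(person);

        assertNotNull(personDto);
        assertEquals("ALICE SMITH", personDto.getName());
        assertEquals("alice.smith@example.com", personDto.getEmail());
    }
}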
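The test above relies on `Person`, `PersonDto`, and `PersonItemProcessor`, which the article does not show. One possible shape for them, inferred only from the test's assertions, is sketched below. The nested `ItemProcessor` interface is a stand-in for `org.springframework.batch.item.ItemProcessor` so that the sketch compiles without Spring Batch on the classpath, and the wrapper class `PersonProcessorSketch` is a hypothetical name.

```java
import java.util.Locale;

// Hypothetical sketch inferred from the test's assertions; not the article's actual classes.
class PersonProcessorSketch {

    /** Stand-in with the same shape as org.springframework.batch.item.ItemProcessor. */
    interface ItemProcessor<I, O> {
        O process(I item) throws Exception;
    }

    /** Input item: only the fields the test sets. */
    static class Person {
        private String firstName;
        private String lastName;
        private String email;
        private int age;

        String getFirstName() { return firstName; }
        void setFirstName(String firstName) { this.firstName = firstName; }
        String getLastName() { return lastName; }
        void setLastName(String lastName) { this.lastName = lastName; }
        String getEmail() { return email; }
        void setEmail(String email) { this.email = email; }
        int getAge() { return age; }
        void setAge(int age) { this.age = age; }
    }

    /** Output item: only the fields the test reads. */
    static class PersonDto {
        private final String name;
        private final String email;

        PersonDto(String name, String email) {
            this.name = name;
            this.email = email;
        }

        String getName() { return name; }
        String getEmail() { return email; }
    }

    /** Joins first and last name in upper case and copies the email unchanged. */
    static class PersonItemProcessor implements ItemProcessor<Person, PersonDto> {
        @Override
        public PersonDto process(Person person) {
            String name = (person.getFirstName() + " " + person.getLastName())
                    .toUpperCase(Locale.ROOT);
            return new PersonDto(name, person.getEmail());
        }
    }

    public static void main(String[] args) {
        Person person = new Person();
        person.setFirstName("Alice");
        person.setLastName("Smith");
        person.setEmail("alice.smith@example.com");
        PersonDto dto = new PersonItemProcessor().process(person);
        System.out.println(dto.getName() + " <" + dto.getEmail() + ">");
    }
}
```

Because the processor has no dependency on a running job, it can be instantiated with `new` and exercised directly, which is what keeps the unit test so short.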
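4. Drawbacks
The main drawback of Spring Batch is its long learning curve: it takes time to learn its core concepts and how to use them. In addition, some advanced capabilities, such as scheduling and monitoring, require extra configuration and setup, and developers need some experience before they can understand and use them comfortably.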
Original article by OQANF. If you repost it, please credit the source: https://www.506064.com/zh-hk/n/333138.html