Finding a better way to solve batch issues: Concurrent VSAM file sharing
When batch processes take CICS applications offline and make data inaccessible, productivity stalls, crucial information is unavailable, and decisions are delayed. This matters because CICS® applications drive a large percentage of the transactional processing at the core of worldwide commerce and information systems. According to current estimates, CICS applications handle more than 30 billion transactions per day and process more than $1 trillion worth of business every week.
For many organizations, VSAM data is important to these CICS applications. The dilemma of finding time to let batch update VSAM but also keep CICS up becomes more relevant to businesses and their customers as the businesses try to operate for more hours per day, especially online. In fact, 84 percent of respondents in the 2018 Arcati survey said specifically that they're web-enabling CICS subsystems.[1]
This white paper examines these issues and offers a solution: concurrent VSAM file sharing that creates a virtual processing window in CICS. CICS-VSAM-batch file sharing is a proven method for leveraging high-performance mainframe systems without disrupting users or altering source code.
This paper covers the following four topics that are critical in evaluating a VSAM file-sharing solution:
- Architecture
- Performance, reliability, and scalability
- Recovery
- CICSplex and Sysplex scalability
Reducing the batch window or sharing VSAM files inside CICS
You have two general options when dealing with batch window issues. You can either keep tuning batch processes, attempting to reduce the duration of the batch window, or you can break out of the conventional paradigm entirely and consider CICS/batch VSAM file-sharing, thus moving to a virtual batch window.
CICS treats a batch step as one transaction (albeit a transaction where many records can be processed). Companies are taking advantage of this concept by processing their batch cycle several times a day; in some cases as often as every two or three minutes. If you take a close look at how often data (which feeds the batch process) arrives at the IT operations area, you may find that running batch more often throughout the day is not only possible but also highly effective.
By sharing VSAM files inside CICS, batch jobs appear to CICS like any other online transaction. Batch jobs can then process while CICS continues to have full READ/WRITE access to VSAM files.
Since CICS has already been entrusted with your data, extending CICS to batch is a natural move. There are numerous advantages to this approach, including excellent performance, assurance of data integrity, straightforward implementation, and the inherent reliability and recovery benefits of CICS itself. If your batch cycle updates between 100 and 1,000,000 or more records per batch step, this concept is an excellent solution.
Architecture
When considering a VSAM file-sharing solution, architectural considerations are crucial for maintaining optimal performance as well as ensuring existing and future compatibility in an evolving mainframe environment. For VSAM file sharing to occur, three fundamental questions must be answered:
- How does the solution get control of the existing batch program to initiate the file sharing process?
- How will batch communicate with the CICS address spaces?
- Is a CICS component available to handle the batch request?
In other words, three main areas need to be addressed to determine the optimal architecture:
- The subsystem
- Batch-to-CICS communication
- The CICS component
A solution should use an IBM-documented, IBM-supported subsystem that awakens only when its services are needed. The solution should not consume any cycles when batch-CICS file sharing is not taking place. It should also be possible to implement it without changes to source code. Finally, there should be no hooks into the operating system (OS) that could cause incompatibility issues when the OS changes from release to release.
Once a file-sharing batch job starts, the subsystem should gain control. Initial communication between the batch job and CICS should then be supported through TCP/IP or VTAM. The solution should be smart enough to determine automatically whether the batch job and the CICS region or regions are running in the same LPAR. If they are, the solution should automatically use cross-memory services. If multiple LPARs are involved, TCP/IP or VTAM is used automatically. This approach ensures that the highest-performing communication method available is used.
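The selection rule described above can be sketched in a few lines of Python. This is only an illustrative model, not any product's implementation: the `Endpoint` type, the `choose_transport` function, and the LPAR and job names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    """A batch job or CICS region and the LPAR it runs in (hypothetical model)."""
    name: str
    lpar: str


def choose_transport(batch_job: Endpoint, cics_region: Endpoint,
                     network: str = "TCP/IP") -> str:
    """Pick cross-memory services within one LPAR, else the configured network."""
    if network not in ("TCP/IP", "VTAM"):
        raise ValueError(f"unsupported network transport: {network}")
    if batch_job.lpar == cics_region.lpar:
        # Same LPAR: no network hop is needed.
        return "cross-memory"
    # Different LPARs: fall back to the network transport.
    return network


job = Endpoint("PAYJOB", "LPAR1")
print(choose_transport(job, Endpoint("CICSA", "LPAR1")))
print(choose_transport(job, Endpoint("CICSB", "LPAR2")))
```

The point of the sketch is that the decision is made per batch job and per region at run time, so the same JCL can move between a single-LPAR and a multi-LPAR configuration without change.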
Additionally, the CICS component should be just another CICS transaction. If the file-sharing solution conforms to the rules of CICS, it will be forward-compatible and backward-compatible with CICS releases.
By following this type of architectural approach, you establish a light system footprint, maintain high performance communication, and ensure forward-compatibility and backward-compatibility when CICS or the OS changes. This type of solution is flexible enough to transparently adapt from a simple LPAR environment to the most sophisticated parallel Sysplex/CICSplex with record-level sharing (RLS) and workload management.
Performance, reliability, and scalability
Every organization that has embraced file sharing has concerns about system performance. To address this topic thoroughly, this paper analyzes performance from a CICS perspective and then from a batch perspective.
The most trusted OLTP on the market — CICS is the most widely trusted and utilized online transaction processor (OLTP) on the market. IBM has invested more than 40 years in the continued development and improvement of CICS performance, reliability, and integrity. It follows that using CICS to handle batch I/O is the most logical approach when selecting a file-sharing architecture. However, data centers that are tuned to manage millions of transactions a day in their CICS regions have reason to be wary. No one wants to break what works. CICS can handle file sharing with batch; it just needs to be configured correctly, with the proper subsystem solution, to ensure success.
To use CICS for batch file sharing and the additional traffic that entails, apply one simple, fundamental concept: tune according to established practices. Your batch file-sharing jobs are now an extension of CICS and will run well as long as the CICS address spaces are tuned. For example, where tuning is appropriate, you might add more strings to your FCT or increase your LSR buffers.
Setting CICS transaction priorities to ensure SLAs — Since your batch file-sharing job is now just another CICS transaction, you can set a priority for the file-sharing CICS transaction like any other CICS transaction. If there is a lot of traffic in CICS, this tells CICS how best to manage its resources. For example, if a batch file-sharing CICS transaction has a lower priority than another CICS transaction and there is competition for resources, CICS will automatically provide the higher-priority transaction(s) with the resources, and the batch file-sharing job will wait until the resources are available.
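The effect of transaction priority can be illustrated with a toy dispatcher. This sketch does not reproduce the real CICS dispatcher; the `Dispatcher` class, the transaction names, and the priority values are hypothetical, and the only behavior modeled is "higher priority is served first, ties in arrival order."

```python
import heapq
from itertools import count


class Dispatcher:
    """Toy priority dispatcher: higher priority runs first, ties run in arrival order."""

    def __init__(self) -> None:
        self._queue: list[tuple[int, int, str]] = []
        self._arrival = count()

    def submit(self, transaction: str, priority: int) -> None:
        # heapq is a min-heap, so the priority is negated to pop the highest first.
        heapq.heappush(self._queue, (-priority, next(self._arrival), transaction))

    def dispatch_all(self) -> list[str]:
        order = []
        while self._queue:
            _, _, transaction = heapq.heappop(self._queue)
            order.append(transaction)
        return order


dispatcher = Dispatcher()
dispatcher.submit("BATCH-FILE-SHARING", priority=1)
dispatcher.submit("ONLINE-INQUIRY", priority=200)
dispatcher.submit("ONLINE-UPDATE", priority=200)
print(dispatcher.dispatch_all())
```

Although the batch file-sharing work arrived first in this example, it is served last, which is the behavior you would want when online SLAs take precedence.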
CICS ensures data integrity for all transactions — I/O requests are either successful or backed out. While the system updates a record, other transactions are put on hold. For a single transaction updating multiple VSAM records, the updates are either all successful or they are all backed out. When you add batch processing to the CICS environment, the same data integrity applies.
Sync points and performance — To CICS, a batch file-sharing job is a single transaction even though it is a long-running transaction. Most of the time batch jobs complete successfully. In very unusual situations, they do not. If transactions do not complete successfully, CICS automatically backs out any updates that batch makes. For example, when a batch step updates 1000 records, CICS will build up deferred work elements until the sync point is issued at the end of the batch step. If the batch job has an abnormal end-of-job (ABEND) during this process, before a sync point is issued, CICS Dynamic Transaction Backout (DTB) restores records to their original image and makes them available for other transactions.
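The all-or-nothing behavior described above can be modeled by saving a before-image of each record on its first update and restoring those images on failure. The sketch below is a simplified illustration of that idea, not the CICS DTB implementation; the `UnitOfWork` class, the `run_batch_step` function, and the `fail_at` parameter used to simulate an ABEND are all hypothetical.

```python
_MISSING = object()  # marks a record that did not exist before the unit of work


class UnitOfWork:
    """Keeps before-images so uncommitted updates can be backed out."""

    def __init__(self, dataset: dict) -> None:
        self._dataset = dataset
        self._before_images: dict = {}

    def update(self, key, value) -> None:
        # Save the original image only on the first update of each record.
        if key not in self._before_images:
            self._before_images[key] = self._dataset.get(key, _MISSING)
        self._dataset[key] = value

    def syncpoint(self) -> None:
        # Commit: the before-images are no longer needed.
        self._before_images.clear()

    def backout(self) -> None:
        # Restore every record touched since the last sync point.
        for key, image in self._before_images.items():
            if image is _MISSING:
                self._dataset.pop(key, None)
            else:
                self._dataset[key] = image
        self._before_images.clear()


def run_batch_step(dataset: dict, updates: list, fail_at=None) -> bool:
    """Apply all updates as one unit of work; back out if the step fails."""
    uow = UnitOfWork(dataset)
    try:
        for index, (key, value) in enumerate(updates):
            if index == fail_at:
                raise RuntimeError("simulated ABEND")
            uow.update(key, value)
        uow.syncpoint()  # single sync point at the end of the step
        return True
    except RuntimeError:
        uow.backout()
        return False


master = {"A": 1, "B": 2}
run_batch_step(master, [("A", 10), ("B", 20), ("C", 30)], fail_at=2)
print(master)  # the failed step leaves the original images in place
```

Note that the before-image store grows with every record updated before the sync point, which is the memory pressure the next paragraph describes.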
For longer-running jobs, the above scenario may not make sense. Some data centers have file-sharing batch jobs that update millions of records during a given batch step. If these batch steps do not issue sync points, CICS will lock all records that were updated during the batch step, and CICS will build up deferred work elements to the point that they may consume all available CICS memory. As a result, CICS response time slows, and the CICS address space may crash due to a lack of storage.
The solution you implement must give you the ability to prevent this from happening. You want the capability to specify the number of units of work that will be processed before a sync point is issued. Equally important is the ability to adjust the sync point frequency without changing the application itself. Ideally, you want a solution that lets you associate the sync point frequency with a specific file in the batch job, so that each read or write to that file defines the completion of a unit of work. You could set the sync point frequency simply by adding a keyword to the DD statement for the file in the batch JCL. Alternatively, if changing JCL is not an option, you could add an entry in a control file that accomplishes the same task.
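To see why sync point frequency matters, the following sketch counts how many sync points a batch step issues and how many uncommitted units of work are outstanding at the worst moment. It is a back-of-the-envelope model only; the function name and the `syncpoint_every` parameter are hypothetical stand-ins for whatever keyword or control-file entry a given solution provides.

```python
def process_with_syncpoints(record_count: int, syncpoint_every: int) -> tuple[int, int]:
    """Return (sync points issued, peak number of uncommitted units of work)."""
    if syncpoint_every < 1:
        raise ValueError("syncpoint_every must be at least 1")
    pending = 0      # units of work since the last sync point
    peak = 0         # worst-case backlog of deferred work
    syncpoints = 0
    for _ in range(record_count):
        pending += 1
        peak = max(peak, pending)
        if pending == syncpoint_every:
            syncpoints += 1
            pending = 0
    if pending:
        syncpoints += 1  # final sync point at the end of the batch step
    return syncpoints, peak


# One sync point at end of step versus one every 1,000 units of work.
print(process_with_syncpoints(1_000_000, 1_000_000))
print(process_with_syncpoints(1_000_000, 1_000))
```

In this model, a one-million-record step with a single sync point holds up to one million uncommitted units of work, while a frequency of 1,000 caps the backlog at 1,000 at the cost of issuing 1,000 sync points. Because the frequency is a parameter rather than program logic, it can be tuned without touching the application.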
Minimizing the effect of batch I/O on CICS performance — Let's take a closer look at how a good file-sharing solution will minimize the CICS overhead associated with typical I/O activities you may encounter in common file-sharing scenarios. Many batch jobs are designed to read a record on a transaction file and then read the master file in sequential order until a match on the record key is found. When this occurs, the business logic in the batch program executes and then writes the updated master record.
This process repeats until all records on the transaction file have been processed. When this batch program uses file sharing, the read issued to the master file defined to CICS is translated into a CICS I/O instruction known as “get-for-update.” In terms of CICS overhead, this kind of read is expensive because the record is locked and then unlocked, increasing CPU cycles. If the master file record that was just read does not match the key on the transaction file, CICS issues another get-for-update read. This process repeats until the match is found.
A quality file-sharing solution will recognize this situation and avoid this extra overhead by converting the initial read in CICS to a “start browse/read next.” This specific CICS I/O instruction uses far fewer cycles in CICS. This is a simple example of how a quality file-sharing solution can minimize overhead in CICS and keep batch run times close to native processing times. Valuable file-sharing solutions recognize this technique and many other types of batch I/O processing techniques and use the one with the best-performing CICS I/O.
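The difference between the two access patterns can be made concrete by counting record locks in a toy model. The sketch below is not a measurement of CICS cost; the function names are hypothetical, and it assumes (as an illustration only) that browse reads take no update lock and that only the matching record is then locked for update.

```python
def locks_with_get_for_update(master_keys: list, wanted) -> int:
    """Sequential scan where every read is a get-for-update (lock, then unlock)."""
    locks = 0
    for key in master_keys:
        locks += 1  # each record read is locked, even if it does not match
        if key == wanted:
            break
    return locks


def locks_with_browse(master_keys: list, wanted) -> int:
    """Start browse/read next until the key matches, then lock only that record."""
    locks = 0
    for key in master_keys:  # browse reads: assumed to take no update lock
        if key == wanted:
            locks += 1       # lock only the record that will be rewritten
            break
    return locks


master = list(range(1, 1001))  # hypothetical master file keys in sequence
print(locks_with_get_for_update(master, 500))
print(locks_with_browse(master, 500))
```

In this model, finding the 500th record costs 500 lock/unlock pairs with get-for-update and a single lock with the browse approach, which is the kind of saving the conversion is meant to deliver.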
Sync points, recovery, and restartability
When optimizing batch jobs, you must consider the issues of using sync points, recovery, and restartability.
Sync points — It is best to use sync points on long-running batch jobs to optimize CICS performance and ensure that service-level agreements are met. However, using them creates a recovery situation that you must examine carefully. When sync points are issued, CICS is no longer responsible for the committed unit of work. This means you must create another process that either restores the data to its original condition and reruns the batch job, or one that restarts the batch job and picks up the process at the prior point of failure.
If you can simply rerun the job from the beginning without adverse effects on the data, this is not an issue. However, this is typically not the case. If you cannot rerun the job from the beginning, you have two options: you can allow for restartability, where the job picks up processing again from the most recent successful sync point, or you can look for a file-sharing solution that performs recovery for you. Let's take a closer look at each alternative.
Recovery — It is possible to have a file-sharing solution perform recovery for you. Any recovery process must preserve data integrity. Remember, if an ABEND occurs, CICS DTB will back out any in-flight unit of work. An in-flight unit of work consists of the records that batch is processing and that are not yet available for other transactions to use. When the program has issued sync points, however, the updates that batch already committed to any records must also be reversed or backed out. The next batch job to start after the initial ABEND occurs should back out these updates.
In order to implement recovery, you should adjust production JCL by adding a recovery step after each update step. While the recovery step is running, these files are still fully available to CICS. Running a batch recovery process while the file is available to CICS means that other transactions could be updating the same data fields that the recovery process is backing out. You want to select a file-sharing solution with a recovery process that recognizes these different situations and handles them appropriately. From a wall clock point of view, recovery will take longer than restartability.
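One way to picture a recovery step that runs while the file stays open to CICS is a backout driven by a log of committed updates. The sketch below is an assumption-laden illustration, not a description of any product: the log format, the `back_out_committed` function, and the policy of reporting (rather than overwriting) records that another transaction has changed since the batch update are all hypothetical choices.

```python
def back_out_committed(dataset: dict, log: list) -> list:
    """Reverse committed batch updates, newest first.

    Each log entry is (key, before_image, after_image). A record whose current
    value no longer equals the batch after-image was changed by someone else,
    so it is reported as a conflict instead of being overwritten.
    """
    conflicts = []
    for key, before_image, after_image in reversed(log):
        if dataset.get(key) != after_image:
            conflicts.append(key)  # assumed policy: leave for manual review
            continue
        dataset[key] = before_image
    return conflicts


# Batch committed A: 1 -> 3 and B: 5 -> 6, then an online transaction set B to 99.
master = {"A": 3, "B": 99}
batch_log = [("A", 1, 3), ("B", 5, 6)]
print(back_out_committed(master, batch_log))
print(master)
```

Here record A is restored to its original image, while record B is flagged because an online transaction updated it after the batch commit. A real solution would need a considered policy for such cases; the sketch only shows why the recovery process has to detect them.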
When you compare which of these approaches is best for your organization, consider how often batch jobs ABEND, whether you have source code available for the batch programs, and whether you have programmers that understand the architecture of each batch program that needs to be changed. Also evaluate how long it will take your programmers to change the source code and then test those changes.
Restartability — Allowing the batch job to restart at the point of failure is a good solution. In order to implement this, however, you have to make the source code available for each batch program using the file-sharing approach and also have programmers available to change the code. Programmers need to understand the architecture of the programs and be able to add the restartability code to the program. When coding is complete, each scenario needs to be tested to ensure that the job was done correctly.
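The restart logic that programmers would add can be sketched as a checkpoint that advances at each sync point. This is a minimal model under stated assumptions: the `run_batch` function, the `checkpoint` dictionary, and the `fail_at` parameter that simulates an ABEND are hypothetical, and uncommitted work is simply discarded on failure to stand in for backout of the in-flight unit of work.

```python
def run_batch(records: list, committed: list, checkpoint: dict,
              syncpoint_every: int, fail_at=None) -> bool:
    """Process records from the last checkpoint; return False on a simulated ABEND."""
    pending = []  # work done since the last sync point
    for index in range(checkpoint["next"], len(records)):
        if index == fail_at:
            return False  # ABEND: pending work is lost (backed out)
        pending.append(records[index])
        if len(pending) == syncpoint_every:
            committed.extend(pending)       # sync point commits the unit of work
            pending.clear()
            checkpoint["next"] = index + 1  # a restart resumes after this record
    committed.extend(pending)               # final sync point at end of step
    checkpoint["next"] = len(records)
    return True


records = list(range(10))
committed: list = []
checkpoint = {"next": 0}

run_batch(records, committed, checkpoint, syncpoint_every=3, fail_at=7)
print(committed, checkpoint)  # work up to the last sync point survives

run_batch(records, committed, checkpoint, syncpoint_every=3)
print(committed, checkpoint)  # the rerun resumes at the checkpoint
```

The rerun does not repeat records that were already committed, which is why a restarted job finishes sooner than one that is recovered and run again from the beginning.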
People who are familiar with restartability and the architecture of the batch program are the best resources for adding restartability to batch programs. If you include restartability in a batch program in a production environment, the recovered batch job will finish processing sooner because it restarts at the point of failure. With a virtual batch window, however, this may no longer be an issue.
CICSplex and Sysplex scalability
When you choose a solution, keep in mind that the best file-sharing solution will support past, current, and future IBM environments. This means it should work correctly in the CICSplex and Sysplex. It should also support IBM guidelines for LPAR pricing and usage monitoring, support TCP/IP and/or VTAM network communication, and function correctly with the workload manager (WLM). Day One support for new OS and CICS releases should also be on the list of requirements.
Conclusions: A better way
CICS and batch are two extremely reliable and stable processes. Performance is managed easily, the systems are understood, and expectations are already in place. All the practices you have developed over the years remain when you implement a VSAM file-sharing solution, and there is no need to drastically modify what you have already invested in, namely the mainframe itself, CICS, existing CICS and batch applications, and the people whose skills support everyday functions.
Using a solution that tightly integrates into CICS, you can take advantage of existing CICS services. Source code remains intact, SLAs are protected, there is minimal impact on existing processes, data integrity persists even after ABENDs, and the solution is scalable across CICSplex and Sysplex environments. The solution does not need to be redesigned or rearchitected when IBM changes or improves CICS.
The advantages of this solution are clear. Rip-and-replace is a risky, time-consuming, and costly undertaking. These ROI considerations alone are significant. The time and effort required to re-code applications or replace systems can be avoided. The user community requires no re-training, and existing IT processes and procedures can remain as well.
The world now expects data availability and application access on a 24/7 schedule. By sharing VSAM files inside of CICS, you can avoid the pain associated with alternative approaches while ensuring high performance levels, data integrity, implementation ease, reliability, and recoverability.
Footnotes
1. Arcati Ltd, Arcati Mainframe Yearbook 2018, Arcati Ltd, 2018.