In XML File with VBA
They wasted THOUSANDS of HOURS, no automation at all…
For years, system analysts in a company I used to work for, did the same grueling search over and over again. They had to write software system requirements documents (SRS documents) in which they had to describe APIs (Application programming interfaces).Each API had Input parameters, business logic, and output parameters. In many APIs a call to a legacy system was made. This legacy system stored all of the business data in a large data base, and the interface to the legacy system enabled querying for information and also performing various important business activities.The problem was the names of the parameters of the legacy system’s interface. This interface was built to accept and to output parameters with old names that where programmed into the code when the legacy system was built a long time ago, at the beginning of the business. No one wanted to change the legacy system’s code with new parameter names, because it was too critical and the company depended on it to continue serving its customers. The risks of breaking interfaces and causing bugs in the legacy system’s code and data base, were too high. Another alternative solution had to be found. So, the alternative solution was constructed as follows:
1.) A translation XML file was created with each parameter’s legacy name and the translation to this name in the form of a new name of the parameter as it is used in the input or output parameters of the APIs that are using this parameter.
2.) The XML structure relating to what is described above look like this:
The XML elements meaning is:
3.) A web application was created for the system analysts to search in the XML described above.
The system analysts used the application to get all of the OLD parameter names of the legacy system interface, and then search for the NEW parameter names. They had to take an OLD parameter name, write it in the search box of the web application, and then click the search button to get the NEW parameter name. They had to do it ONE parameter at a time, ONE by ONE, a grueling search for each and every API. They wasted THOUSANDS of HOURS on this kind of search for YEARS and YEARS…
The developers of the web application did not think about creating a way to search for MULTIPLE NEW names in one search.
The OLD names and the NEW names had to be documented inside the XML file for future use and also it had to be written in the system requirements documents (SRS documents) that the system analysts wrote for the developers. In turn, the developers used the SRS documents to code the new APIs with the NEW parameter names.The system analysts had to describe the OLD names and the corresponding NEW names in their SRS document as follows:
Image 1 – the parameters table in the SRS documents
The automation solution was just around the corner, but nobody saw it (or tried to see it).
Nobody thought about creating a simple and effective solution for this grueling search, although it was just around the corner, in something that was used each and every day, EXCEL’s VBA.
When I joined the company as a system analyst, I already knew how to code in various languages and VBA was also one of them, All I had to figure out is how to get the updated XML file with all of the OLD and NEW names mapped correspondingly as described above. I have found out that the XML file was updated on a daily basis with every addition of new [OLD name] -> [NEW name] mapping.
Enter automation using EXCEL’s VBA
Besides the “Settings” sheet, I also use another sheet to hold the list of the OLD names. This sheet is named “ESB” i.e. Enterprise Service Bus. The ESB is a software implementation that serves as a service mediator between the legacy system and other business applications such as the applications here that implement various APIs to extract data from the legacy system’s data base. In this context, a NEW business application will call a service exposed by the ESB in order to communicate with the legacy system’s code. The ESB will channel and handle the communication between the two applications, transfer the required input parameters to the legacy system’s code, then the business application will get the data output it needs and expose it in its API output to whoever called it, i.e. some other application that will use the NEW API with the NEW Names of the input and output parameters.
How to use the automation VBA Excel sheet
How to use the new automation VBA Excel sheet
In order to use this automation sheet, the user must fill the “Settings” sheet as follows:
1. Enter the path to the dictionary XML file (either on a local PC or in a network location) under the “XML Path” column.In image 1 above it is “C:\Dictionary.xml”
2. Enter the column letter that contains the OLD names in the “ESB” sheet (I will elaborate more on the structure of the “ESB” sheet below), under the column “OLD Column Letter”. In image 1 above the column letter is “A”.
3. Enter the row number from which the VBA code will need to start the iteration over the list of the OLD names in the “ESB” sheet. This has to be written under the column “OLD Row START”. In image 1 above the row to start is row 10.
4. Enter the row number in which the VBA code will need to end the iteration over the list of the OLD names in the “ESB” sheet. This has to be written under the column “OLD Row END”. In image 1 above the row to end is row 300.
5. Enter the NUMBER of the column in the “ESB” sheet (starting the count from the left column in a 1 based count, not zero based count) that the VBA code should insert the NEW Name that was found in the XML file (according to the corresponding OLD Name). It has to be written under the column “Result Column”.
6. The following two entries require an explanation. It turned out that sometimes the ESB system implemented a service interface to the legacy system by using the NEW parameter Names. Hence, in order to document every OLD name and its corresponding NEW name as depicted in the table at image 1 above, the system analysts needed a way to reverse the search in the XML file, i.e. find the OLD names by searching for the NEW names element and then going to the next element (next sibling instead of previous sibling) to the OLD name element and extracting its text. So, if an “OLD to NEW” search had to be used, the user must enter an UPPER CASE “Y” letter under the “OLD to NEW” column and an UPPER CASE “N” letter under the “NEW to OLD” column, and vice versa. This is a perfect example why planning a GENERIC solution in the form of a “Settings” sheet can be adapted to tackle new and unexpected requirements. A new feature can be easily added and a new entry of setting can be created in the “Settings” sheet.
7. Finally, the user must get a list of the parameter names from the ESB service interface, and copy them into the “ESB” sheet, and click the “Search & Extract” button on the right side of the entries in the “Settings” sheet. Now, the magic happens and the names found in the XML file are copied to the “Result Column” in the “ESB” sheet.
All the VBA code was written in the “ESB” Sheet1’s code space as you see in image 3 below.
Let’s dive in to the VBA code and see how the automation magic happens
All the VBA code was written in the “ESB” Sheet1’s code space as you see in image 3 below.
Image 3 – automating search in XML file VBA project structure
Image 4 – The references dialog box (shown after clicking: Tools menu->References)
As a side note in reference to the xmlNodeList declared above, I have found out that the XML could contain more than one NEW parameter name for ONE corresponding OLD parameter name. I think it is good that you’ll also see how to handle a list of XML nodes. As you’ll see, I output the names from the xmlNodeList as comma separated string for example: newName1,newName2,newName3,newName4, etc…
Now, in order to iterate over the list of OLD names that is found in the “ESB” sheet, a dynamic range is built from the old_column_letter, old_row_start and the old_row_end values that where collected from the settings sheet. The result is a column range in the “ESB” sheet that looks like, for example, “A1:A300”. This is a perfect example why you should consider a “Settings” sheet for your VBA automations in Excel. The user is free to choose what section of the list (or the whole list), should be processed. This generic approach is always much better than hard coded values. The code that does this is as follows:
new_text = oXMLDom.SelectSingleNode(“/dictionary/dictionaryData/oldName” & _
“[translate(text(),” & strTranslate & “)='” & UCase(old_text) & “‘]”).PreviousSibling.text>
Set xmlNodeList = oXMLDom.SelectNodes(“/dictionary/dictionaryData/oldName” & _
“[translate(text(),” & strTranslate & “)='” & UCase(old_text) & “‘]”)End If
strTranslate = “‘abcdefghijklmnopqrstuvwxyz’,’ABCDEFGHIJKLMNOPQRSTUVWXYZ'” Second Argument = ‘abcdefghijklmnopqrstuvwxyz’ Third argument = ‘ABCDEFGHIJKLMNOPQRSTUVWXYZ’The Third argument’s purpose, is to instruct HOW to replace the text that is found in the First argument according to the Second argument. So, any occurrence of ‘abcdefghijklmnopqrstuvwxyz’ in the text of the element (text()) , will be replaced with ‘ABCDEFGHIJKLMNOPQRSTUVWXYZ’ (an uppercase letter).The aim here is to transform the text of the XML node into UPPER CASE, this way the comparison will find the required node with the OLD name, because the searched text is upper cased i.e. UCase(old_text).
So, since each XML node’s text contains alphabetic character ‘abcdefghijklmnopqrstuvwxyz’ , it will be translated into an upper case character according to the third argument ‘ABCDEFGHIJKLMNOPQRSTUVWXYZ’ and then compared to UCase(old_text).This is how you circumvent the case problem by always comparing text with the same case.
If the OLD text was found in some XML element it is stored inside the new_text variable after moving to the PreviousSibling i.e. one element back to get to the newName element. Recall that the structure of the XML is as follows:
if MORE than one XML element contains the same searched text than these XML elements will be stored in the “xmlNodeList” variable. Then, if the “xmlNodeList” length is bigger than one, the text will be extracted from each node and a comma separated string will be created inside the “new_text” variable as follows:
If xmlNodeList.Length > 1 Then
new_text = “”
For Each Node In xmlNodeList
””OLD to NEW
If old_to_new = “Y” Then
new_text = new_text & Node.PreviousSibling.text & “,”
””NEW to OLD
If new_to_old = “Y” Then
new_text = new_text & Node.NextSibling.text & “,”
”’removing the last comma at the end
new_text = Left(new_text, Len(new_text) – 1)
Set xmlNodeList = Nothing
Finally the text that was found, either in the form of one name or a comma separated names is outputted to the proper cell in the “ESB” sheet as follows:
The rowidx is a variable that is incremented with every iteration of the for each loop, starting from the first row that the user wrote in the “Settings” sheet.
In case no text is found a jump to the “continue” label prevent the text from being outputted to the “ESB” sheet. The whole last part of the loop is as follows:
esb_sht.Cells(rowidx, new_column_index).Value = new_textEnd Ifnew_text = “”continue:rowidx = rowidx + 1pctdone = pb_idx / lastrowWith ufProgress
.LabelCaption.Caption = “Processing Row ” & pb_idx & ” of ” & lastrow .LabelProgress.Width = pctdone * (.FrameProgress.Width)End WithDoEventspb_idx = pb_idx + 1If pb_idx = lastrow Then
Unload ufProgressEnd If”this is the end of the for each loopNextSet oXMLDom = Nothing
You can see a sample result after executing the macro, this is how the “ESB” sheet looks like with the “NEW Name”
values in column C, that were found in the XML for the corresponding “OLD Name” values
Image 5 – Sample result in the “ESB” sheet after the execution of the macro
The XML for the result shown here is as follows:
Go ahead and Register below to my email list and download the Excel file with the complete macro for FREE, you’ll be able to use it right away!!!