
UFDC XML Batch Updates: Home

Steps and tools for updating UFDC XML records by batch

Process

1. Prepare the spreadsheet for creating a "set" in the UF Libraries Digital Metadata Steward (DMS), referred to below as the Steward

a. Use two columns, one for bibid and one for vid. Keep the headers in lowercase. Do not apply any cell formatting, including but not limited to font type, font color, or cell background color. See the example below.

bibid vid
AA00000009 00001
AA00000204 00001
AA00000211 00001
AA00000212 00001
AA00000213 00001

        

b. Prepare batches of 100 rows (records/vids) each. Name the files in a way that makes progress easy to track, for instance ProjectKeywords + "_" + Number, e.g. "Nonfiction_001".

c. Track the batch assignments in the original Excel file, or a copy of it, that lists all bibs and vids of the batch update project.
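
The batching and naming in step 1 can also be scripted. The sketch below is only an illustration, not part of the documented workflow: the function names are hypothetical, and it writes CSV files using the Python standard library, so the files would still need to be saved as xlsx in Excel if that is the format you import. It keeps each vid as text so that leading zeros such as "00001" survive.

```python
import csv
from pathlib import Path

BATCH_SIZE = 100  # rows per batch, as suggested in step 1b


def split_into_batches(rows, batch_size=BATCH_SIZE):
    """Yield consecutive slices of `rows`, each at most `batch_size` long."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def write_batches(rows, project_keyword, out_dir):
    """Write each batch to <project_keyword>_<NNN>.csv and return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for number, batch in enumerate(split_into_batches(rows), start=1):
        path = out_dir / f"{project_keyword}_{number:03d}.csv"
        with path.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow(["bibid", "vid"])  # lowercase headers, no formatting
            writer.writerows(batch)            # vids are strings, so zeros are kept
        paths.append(path)
    return paths
```

For example, `write_batches(rows, "Nonfiction", "batches")` with 250 rows would produce Nonfiction_001.csv, Nonfiction_002.csv, and Nonfiction_003.csv, the last holding the remaining 50 rows.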

2. Log in to the UF Libraries Digital Metadata Steward (DMS)/the Steward

3. In the Steward

click Batch Items --> click the IMPORT button (at the top right corner) --> Choose File to import --> Format, usually "xlsx" (check the spreadsheet file name) --> click SUBMIT

If the next screen shows all items in green, the batch has been updated successfully. Write down the four-digit BATCH SET number shown at the beginning of each row, e.g. "4315".

This step tells the Steward which bibs and vids you want to process.

4. In the Steward

Back to Home --> click Items zips --> click ADD ITEMS ZIP

--> choose the right Batch set number from the dropdown list

--> use "Batch item count high" to decide how many records will be processed during this round. If the spreadsheet prepared in step 1 is a good batch of 100 records and you want to process them all, you can leave it blank.

--> make sure Glob is set to "*.mets.xml"

--> click SAVE

This step tells the Steward to pull the mets.xml files for the bibs and vids you submitted in step 3.
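
To see what a glob of "*.mets.xml" selects, the sketch below applies the same pattern to a few file names with Python's `fnmatch` module. The file names are made up for illustration, and how the Steward itself evaluates the Glob field is an assumption here; the example only shows standard shell-style matching.

```python
from fnmatch import fnmatchcase

GLOB = "*.mets.xml"  # the value entered in the Glob field


def select_mets(names, pattern=GLOB):
    """Return the names that match the shell-style pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]


# Hypothetical contents of one resource folder.
names = [
    "AA00000009_00001.mets.xml",
    "AA00000009_00001.pdf",
    "00001.jpg",
    "notes.xml",
]
matched = select_mets(names)
```

Only the name ending in ".mets.xml" is matched; a plain ".xml" file such as notes.xml is not.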

5. In the Steward

Back to Items zips --> find the three-digit Job ID and the Job Status for your job. When the job's End Datetime is available, the job is done. It usually takes only a couple of minutes for the Steward to pull the mets.xml files from the resource directory folders.

6. On the DLC Drive. This step copies the XMLs so they can be edited.

Locate the item zip job folder at this path: Y:DLC\Main\MARSHALING_WORK\itemszip --> copy the whole folder with the matching three-digit Job ID to your desktop or drive. Leave the item zip folder on the DLC Drive intact as the ultimate source for the original XMLs before edits.
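
The copy in step 6 could be scripted along these lines. This is a sketch under assumptions: `copy_job_folder` is a hypothetical helper, and it assumes the job folder is named exactly after the Job ID, which you should confirm on the drive. It only reads from the source, so the original folder stays intact.

```python
import shutil
from pathlib import Path


def copy_job_folder(itemszip_root, job_id, dest_root):
    """Copy <itemszip_root>/<job_id> into dest_root and return the new path."""
    source = Path(itemszip_root) / str(job_id)
    if not source.is_dir():
        raise FileNotFoundError(f"No job folder found at {source}")
    target = Path(dest_root) / source.name
    # copytree raises FileExistsError if target already exists,
    # which protects a working copy that has already been edited.
    shutil.copytree(source, target)
    return target
```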

7. In Oxygen XML Editor: update the RECORDSTATUS. This is required for all batches.

     Change any RECORDSTATUS value of "NEW", "PARTIAL", or "COMPLETE" in the XML to "METADATA_UPDATE".

Open one XML file --> copy the chunk that needs to be replaced or removed, for instance "RECORDSTATUS="PARTIAL""

--> Find

--> Find and Replace in Files

--> Text to find: the copied content should already be there

--> Replace with: enter the replacement content "RECORDSTATUS="METADATA_UPDATE"" (leaving this field blank removes the copied content instead)

--> click "Find All" to get a sense of how many records need edits, and record this number in the original tracking spreadsheet

--> go back to Find and Replace in Files and click "Replace All" --> Preview, to review the changes in a few XMLs --> OK to carry out the edits.
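
For readers who prefer a script, the Find All and Replace All passes above can be approximated in Python. This is a sketch, not the documented Oxygen workflow: the function name is hypothetical, it assumes the files are UTF-8 encoded, and it handles all three old status values in one pass. Run it on your working copy, never on the original item zip folder.

```python
import re
from pathlib import Path

# Matches RECORDSTATUS="NEW", RECORDSTATUS="PARTIAL" or RECORDSTATUS="COMPLETE".
PATTERN = re.compile(r'RECORDSTATUS="(?:NEW|PARTIAL|COMPLETE)"')
REPLACEMENT = 'RECORDSTATUS="METADATA_UPDATE"'


def update_record_status(folder, dry_run=True):
    """Return {path: match_count}; rewrite files only when dry_run is False."""
    counts = {}
    for path in sorted(Path(folder).rglob("*.mets.xml")):
        text = path.read_text(encoding="utf-8")  # assumption: UTF-8 files
        new_text, found = PATTERN.subn(REPLACEMENT, text)
        if found:
            counts[path] = found
            if not dry_run:
                path.write_text(new_text, encoding="utf-8")
    return counts
```

Calling `update_record_status(folder)` first plays the role of "Find All": `len(result)` is the number of records that need edits. Calling it again with `dry_run=False` carries out the replacement.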

8. In Oxygen XML Editor: make the desired changes.

9. On the DLC Drive

Load the XMLs at the path below:

Y:DLC\Main\INCOMING\Metadata

Please note that the Sobek builder only accepts XML folders with the naming and organization they have inside the job folder, so when loading XMLs, open the job folder and copy the XML folders inside it directly.

 

Later, when a load is done, locate the errors in the MetadataFailures folder (Y:DLC\Main\INCOMING\MetadataFailures) using the date the files were added, and record the number of errors in the original tracking spreadsheet.
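
Counting the failures for a given day can be sketched as follows. The helper name is hypothetical, and the sketch uses each entry's modification time as a stand-in for the date it was added, which is an assumption: the date shown in your file browser may be based on a different timestamp.

```python
from datetime import date, datetime
from pathlib import Path


def failures_on(folder, day):
    """Return the sorted names of entries in `folder` last modified on `day`."""
    hits = []
    for entry in Path(folder).iterdir():
        modified = datetime.fromtimestamp(entry.stat().st_mtime).date()
        if modified == day:
            hits.append(entry.name)
    return sorted(hits)
```

For example, `len(failures_on(folder, date.today()))` gives the number to record in the tracking spreadsheet for a load done today.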

 

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.