By April Eggers, Senior Software Developer, FDI
As a front-end developer, I use JSON – a lot. For simple applications, looking at JSON strings in Notepad++ or pasting them into jsonlint.com is all I need, but when I’m building big applications, talking to multiple REST endpoints, designing unit tests, mock objects and doing a bunch of modeling at the intersection of a UI and back end services, I can quickly find myself flooded with too many JSON responses for those solutions to be viable. I recently took some time to solve the problem of how to view all this data in a way that’s efficient and easy to understand.
TL;DR: If you’re feeling lucky, here’s my handy solution to replace raw JSON with formatted JSON. If you’re feeling lucky but cautious, see the disclaimers at the end of this article. If you’re feeling lucky but distracted, and like a good story, read on.
I had 96 unformatted JSON files that looked something like this (generated with https://www.json-generator.com/):
And I wanted them to look like this:
At first, I thought that I’d only want to view several of the files, so I looked for an Emacs solution to pretty print JSON. This is a trivial task with the JSTool plugin for Notepad++, and I assumed it would be a well-solved problem for Emacs, so I went to the Googles. The solutions that I found involved either the Python module json.tool or major modes, both of which were new concepts to me, and neither of which actually formatted my JSON. After trying to format my JSON with Emacs, then later Gedit, for some time, I gave up, copied several of the files to my Windows VM host, and formatted them with Notepad++ (Ctrl + Alt + m is the default keybinding, in case you’re wondering and/or have an aversion to running commands by clicking through menus).
Today, I found myself back at this problem again, when I was trying to grep my files to determine all the values for a certain property. By default, grep prints every line that contains a match, and an unformatted JSON file is one long line, so a single match outputs the entire contents of the file. That “feature” made it difficult to visually parse the results that I was trying to find. Armed with caffeine and enough Bash knowledge to be dangerous, I rolled up my sleeves (metaphorically because it was quite warm today, and I was wearing a short-sleeved shirt), and dove in (again, metaphorically because, unfortunately, I neither own a magic school bus that can shrink down and fit inside computers, nor is my computer a magical chalk drawing in Mary Poppins).
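To see why grep is so noisy on raw JSON, here is a minimal sketch. The file names and the sample object are made up for illustration; they are not the files from this project.

```shell
# Hypothetical sample data: the same object, once minified and once pretty printed.
tmpdir=$(mktemp -d)
printf '%s\n' '{"name":"Ada","eyeColor":"green","age":36}' > "$tmpdir/raw.json"
printf '%s\n' '{' '    "name": "Ada",' '    "eyeColor": "green",' '    "age": 36' '}' > "$tmpdir/pretty.json"

# grep prints each matching line in full; the minified file is a single line.
raw_hits=$(grep '"eyeColor"' "$tmpdir/raw.json")
pretty_hits=$(grep '"eyeColor"' "$tmpdir/pretty.json")

echo "raw:    $raw_hits"
echo "pretty: $pretty_hits"
rm -rf "$tmpdir"
```

The first result is the whole object, while the second is only the one property line, which is what makes formatted files so much easier to grep.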
My vague plan was to pipe each file in the directory to the Python module json.tool, which would format the JSON, and then I could grab the formatted JSON and stuff it back into the original file. The first difficulty that I encountered was trying to get json.tool to format one file to standard out. The not-so-secret secret seems to be running the command exactly as you would expect. Note that you may not have expected the -m flag. It signals Python to run a library module as a script (according to python -h).
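A minimal sketch of formatting a single file to standard out follows. It assumes a Python interpreter is on your PATH as either python3 or python, and the sample file is hypothetical.

```shell
# Assumption: Python is available as python3 or python.
py=$(command -v python3 || command -v python)
tmpdir=$(mktemp -d)
printf '%s\n' '{"name":"Ada","tags":["a","b"]}' > "$tmpdir/sample.json"

# -m runs the json.tool library module as a script.
# With only an input file given, the formatted JSON goes to standard out.
formatted=$("$py" -m json.tool "$tmpdir/sample.json")
echo "$formatted"
rm -rf "$tmpdir"
```

Depending on the Python version, json.tool may sort the keys or keep them in their original order, so the exact output can differ between machines.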
I’m not sure exactly where it was going south for me, but I saw the error
Disheartening, to say the least. Eventually the code gremlins stepped aside, and I was formatting JSON like a boss.
The next step was to format all the files in the directory, instead of formatting files one by one. My initial thought was to use sed with the -i (in place) flag. However, sed generally expects a find and replace regex, not a Bash command that runs a Python script. Half-hearted attempts to use the -f (script file) or -e (script) flags were unsuccessful. Finally, I resigned myself, girded my loins and attempted an inline for loop.
Generally, I try to avoid multi-line commands in the terminal because it’s difficult to arrow up through the command history and make a change. So I wanted to keep my for loop on a single line for fast modifications.
My first attempt,
saw the return of the dreaded “No JSON object could be decoded” error. As far as I could determine, this was because I wasn’t dereferencing i. At all. $ is your friend, people.
Seems more correct, right? Look at that handy dandy dereferencing happening. Alas, still no love. From what I could tell, something about this syntax seemed to cause Python to think that the filename was actually the JSON that I wanted to pretty print.
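The difference between i, $i, and the contents of the file that $i names can be sketched like this. This is an illustration under assumptions (a hypothetical sample file, Python on the PATH as python3 or python), not a reconstruction of the exact commands I ran.

```shell
# Assumption: Python is available as python3 or python.
py=$(command -v python3 || command -v python)
tmpdir=$(mktemp -d)
printf '%s\n' '{"name":"Ada"}' > "$tmpdir/sample.json"

for i in "$tmpdir"/*.json; do
    literal=$(echo i)        # no $: just the letter "i"
    expanded=$(echo "$i")    # with $: the path of the file, not its contents

    # Piping the path hands json.tool text that is not JSON, so it exits non-zero.
    if echo "$i" | "$py" -m json.tool >/dev/null 2>&1; then
        path_result="parsed"
    else
        path_result="rejected"
    fi
done

echo "literal=$literal expanded=$expanded path_result=$path_result"
rm -rf "$tmpdir"
```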
My next attempt was to read the file and pipe it to my Python command.
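A single-line loop of this kind might look like the sketch below. The directory and sample files are hypothetical, and it assumes Python is on the PATH as python3 or python.

```shell
# Assumption: Python is available as python3 or python.
py=$(command -v python3 || command -v python)
tmpdir=$(mktemp -d)
printf '%s\n' '{"id":1,"ok":true}' > "$tmpdir/a.json"
printf '%s\n' '[{"id":2},{"id":3}]' > "$tmpdir/b.json"

# The loop stays on one line so it is easy to recall from history and tweak.
# It is wrapped in $(...) here only to capture the output for display.
all_output=$(for i in "$tmpdir"/*.json; do cat "$i" | "$py" -m json.tool; done)

echo "$all_output"
rm -rf "$tmpdir"
```

Reading the file with cat and piping the contents means json.tool receives actual JSON on standard in rather than a file name.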
Success! Back to formatting like a boss! Now all I had to do was overwrite the original file with the formatted JSON, and I’d be home free.
Various efforts to pipe/redirect my formatted JSON back into its file failed for the simple reason that you can’t simultaneously read from and overwrite a file. Who knew? Well, I knew, but I was hoping that the magic of Bash pipes and redirects would completely read the file in, process it, then write back to the file, not attempt simultaneous reading and writing.
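Here is a small sketch of what goes wrong, run against a throwaway file (hypothetical name, Python assumed to be on the PATH as python3 or python). The shell sets up the output redirect, which empties the file, before Python gets a chance to read it.

```shell
# Assumption: Python is available as python3 or python.
py=$(command -v python3 || command -v python)
tmpdir=$(mktemp -d)
f="$tmpdir/sample.json"
printf '%s\n' '{"name":"Ada"}' > "$f"

# The > redirect truncates the file before json.tool starts reading it,
# so json.tool sees empty input, fails, and the original content is gone.
"$py" -m json.tool < "$f" > "$f" 2>/dev/null || echo "json.tool failed on empty input"

size_after=$(wc -c < "$f" | tr -d ' ')
echo "bytes left in file: $size_after"
rm -rf "$tmpdir"
```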
The end result was this command that writes the pretty printed JSON to a temp file, then writes the contents of the temp file over the original, raw JSON file.
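The temp-file approach can be sketched as follows. This is an illustration of the technique rather than my exact command; the directory, the sample files, and the location of the temp file are all assumptions, as is Python being on the PATH as python3 or python.

```shell
# Assumption: Python is available as python3 or python.
py=$(command -v python3 || command -v python)
dir=$(mktemp -d)      # stands in for the directory of raw JSON files
tmp=$(mktemp)         # scratch file kept outside that directory
printf '%s\n' '{"id":1,"ok":true}' > "$dir/a.json"
printf '%s\n' '[{"id":2},{"id":3}]' > "$dir/b.json"

# Format into the temp file first, then copy the temp file over the original.
for i in "$dir"/*.json; do cat "$i" | "$py" -m json.tool > "$tmp"; cat "$tmp" > "$i"; done

lines_a=$(grep -c '' "$dir/a.json")
lines_b=$(grep -c '' "$dir/b.json")
echo "a.json: $lines_a lines, b.json: $lines_b lines"
rm -rf "$dir" "$tmp"
```

Because the read and the write now happen in two separate steps, the original file is fully read before anything overwrites it.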
Some disclaimers, provisos, and a couple of quid pro quos:
- If your JSON is not formatting, and you try to stuff that result into a file, you’ll end up with an empty file (and your raw JSON will be gone). So, before attempting anything, back up your JSON directory!
- I am not a Bash expert by any stretch of the imagination. I’m sure there are better, faster, stronger ways to do this. I went with the first thing that worked for me because it worked. It may not meet your use case.
- There’s obviously no error checking happening. Again, I had a very specific use case, which this meets. I don’t know what would happen if you ran this on an empty directory, or if your directory contained more than just JSON files.
- My version of Linux (CentOS 7) had Python and the json.tool module installed, which is why I didn’t mention anything about what I had to install. You may need to install things for this method to work. Your version of Python might not work (my version of Python is 2.7.5). I’m also using Xfce, instead of the default terminal for CentOS, so that’s another place that may cause hiccups.
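Some of these gaps could be narrowed with a slightly more defensive loop. The sketch below is one possible variation, not something I ran on my real data: it only touches files ending in .json, skips an empty directory, and overwrites a file only when json.tool exits successfully. The sample files are hypothetical, and Python is assumed to be on the PATH as python3 or python.

```shell
# Assumption: Python is available as python3 or python.
py=$(command -v python3 || command -v python)
dir=$(mktemp -d)
printf '%s\n' '{"id":1}' > "$dir/good.json"
printf '%s\n' '{"id":'   > "$dir/broken.json"   # deliberately invalid JSON
printf '%s\n' 'notes'    > "$dir/readme.txt"    # not a .json file, never touched

for f in "$dir"/*.json; do
    [ -e "$f" ] || continue                     # glob matched nothing: empty directory
    tmp=$(mktemp)
    if "$py" -m json.tool "$f" > "$tmp" 2>/dev/null; then
        cat "$tmp" > "$f"                       # overwrite only after a clean format
    else
        echo "skipped (not valid JSON): $f" >&2 # original file is left as it was
    fi
    rm -f "$tmp"
done

good_lines=$(grep -c '' "$dir/good.json")
broken_content=$(cat "$dir/broken.json")
readme_content=$(cat "$dir/readme.txt")
echo "good.json now has $good_lines lines"
rm -rf "$dir"
```

Even with a check like this, backing up the directory first is still the safer habit.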
Lessons learned from this exercise? In no particular order:
- There are lots of great tools for manipulating JSON, using Python modules, plug-ins, etc. Your tooling may be constrained by your development environment, but it’s really important that you develop some tooling and expertise for batch processing of JSON messages.
- Be flexible in your approach. If you get stuck, restart with a different approach!
- Always, always have a back-out strategy! If you spend hours generating GBs of test JSON and then your tooling inadvertently overwrites/deletes your data, You Will Be Sad. A bit of forethought on protecting your data will save you hours of time.
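A back-out strategy can be as simple as copying the directory before touching it. A minimal sketch, with a hypothetical directory and a simulated mishap:

```shell
dir=$(mktemp -d)                      # stands in for the directory of test JSON
printf '%s\n' '{"id":1}' > "$dir/a.json"

# Back up the whole directory before running any batch command on it.
backup="$dir.bak"
cp -R "$dir" "$backup"

: > "$dir/a.json"                     # simulate a command that empties the file

# Back out: copy the untouched version over the damaged one.
cp "$backup/a.json" "$dir/a.json"
restored=$(cat "$dir/a.json")
echo "restored: $restored"
rm -rf "$dir" "$backup"
```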
April Eggers is a senior software developer at FDI, and an Alfresco Certified Engineer. As a front-end developer, she’s worked with JSON responses from many different servers, including OpenText InfoArchive, Alfresco’s enterprise content management system, Alfresco’s workflow engine, and Activiti. When April’s not playing with the latest JavaScript framework, she enjoys fostering kittens from the local humane society, smiling at every dog she sees, and thinking up woodworking projects.