Strippin’ Tags

A while back, I helped create a Django back end for an update of an existing website.  The website’s front end was AngularJS, and we didn’t really get how the two should integrate.  The result?  A bit of a mess.  We had Django template views inside Angular HTML template files, and we had to modify the bracket style used for Angular (as it conflicts with Django’s).  The folder structure was confusing and the code was confusing.  It worked, but it wasn’t a great solution.

Now I understand what we should have done: use Django to create an API for the Angular front end.  Just return that sweet, sweet JSON to the app and have Angular do the templating.  We are in the process of this update (as well as migrating the site from AngularJS to Angular, the Next Generation), and I ran into an interesting project.

The original site had a Django-driven blog.  This was not integrated into the main AngularJS app in any way- a user clicked the ‘news’ link and they were taken to a completely new page with a traditional Django blog setup.  It made creation and updating easy- we just used Django’s admin panel for new posts, and the default templating views to handle any sorting (by category, date, etc).

But it doesn’t really fit.  Having the blog as a module within the Angular (now to be v4.2.4) application makes more sense.  With our new API approach it will work, but it will take some extra work.

One aspect is the admin panel.  We won’t be using the built in Django admin panel- instead, we’ve created an Angular-driven admin panel.  Converting the data returned by Django’s ORM to JSON was a bit tricky at first, but it’s flowing smoothly at this point (might cover that in a future post, as it was a bit of a process).

Another aspect is searching through blog posts (on the user interface), and that’s where we come to the cool project of the week.  One of the benefits of doing this extra work is getting the hip “instant updates” feel of a single page app within our blog’s UI.  When someone types in the search bar, they see the blog list below filter immediately.

But I noticed some test posts were coming up in almost all search results.  They happened to be the test posts with images in them.  The reason?  We are using an HTML editor toolbar for the admin area.  When a site admin posts a new blog, they use this toolbar to format text, add links, or post images (not everyone posting will be a developer or have source code access).

The toolbar has a cool feature where it encodes any images uploaded to base-64 format.  According to Wikipedia, “Base64 is a group of similar binary-to-text encoding schemes that represent binary data in an ASCII string format by translating it into a radix-64 representation.”

I don’t really know what that means, but I do know it means I don’t have to set up an ‘images’ folder for my blog module and save a path to those images with a relation to the blog post in my database.  When it converts to base-64, it gives me a string of text that a modern browser can translate into an image.  That image src can be saved right along with the rest of the HTML body.

Very helpful – but back to the search feature.  It turns out that the search was iterating over everything stored in the blog’s body, including the HTML code.  I would have had to fix this anyway, but it became really apparent with the image issue.  Because the string of text representing an image was so long, it naturally contained matches to most strings I was searching (at least the simple ones people would start with).  It might not have mattered on a traditional request-response site, where a person would type their whole search string, then submit, and never see the wrong results.  But in a search that filters as you type, the problem was obvious.
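To make the problem concrete, here is a minimal sketch of a search running against the raw body.  The post object, the shortened base-64 string, and the `naiveMatch` helper are all made up for illustration; this is not the code from our search service.

```typescript
// Hypothetical sample post: the body holds HTML plus an inline base-64 image.
// The src value is a shortened sample string, not a complete image.
const post = {
  title: 'Test post',
  body: '<p>Hello world</p><img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA">',
};

// Naive search: case-insensitive substring match against the raw HTML.
function naiveMatch(body: string, term: string): boolean {
  return body.toLowerCase().indexOf(term.toLowerCase()) !== -1;
}

console.log(naiveMatch(post.body, 'go'));  // hits inside the image data
console.log(naiveMatch(post.body, 'img')); // hits the tag name itself
```

Neither ‘go’ nor ‘img’ appears in the text a reader would actually see, yet both match.  A real encoded image runs to thousands of characters, so short search terms are even more likely to land somewhere inside it.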

My first thought was a terrible one: Why not store a plain text version of the blog body in the database?  That way, we just return it alongside the rest of the JSON data and use that for the search.  But that really doesn’t make sense- it’s just bloating our database tables with duplicate info and increasing the size of the JSON object returned on each request.

Then I remembered how cool JavaScript can be, and that it comes with awesome built-in array features like ‘map’.  So, why not return the data as we have been (with the body as HTML) and manipulate the body just within the search function?  We can search over that manipulated body, but display the original HTML.

Turns out that this works pretty well.  In our main blog component, we initialize a search array on a service:

this.blogViewService.searchArray = ['author', 'title', 'body_plain_text', 'category'];

This is a convention we’re using on a different project- the array contains the property names we want to search by (those property names appear within each object in the array we will be searching over). That searchArray is passed to our search service with some other info.  In this case, ‘body_plain_text’ doesn’t exist on the object- but we’re going to create it as we go.
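As a rough idea of how a property-name array like this can drive a search, here is a small sketch.  The `filterByProperties` function, the `BlogItem` shape, and the sample data are assumptions for illustration; our actual search service is not shown in this post.

```typescript
// Hypothetical item shape; the real objects come back from the Django API.
interface BlogItem {
  [key: string]: string;
}

// Keep an item if any of the listed properties contains the search term.
// Properties that are missing on an item are treated as empty strings.
function filterByProperties(items: BlogItem[], searchArray: string[], term: string): BlogItem[] {
  const needle = term.toLowerCase();
  return items.filter(item =>
    searchArray.some(prop => String(item[prop] || '').toLowerCase().indexOf(needle) !== -1)
  );
}

const items: BlogItem[] = [
  { author: 'Ann', title: 'Release notes', category: 'news', body_plain_text: 'Version two is out' },
  { author: 'Bob', title: 'Team lunch', category: 'social', body_plain_text: 'Pizza on Friday' },
];
const searchArray = ['author', 'title', 'body_plain_text', 'category'];

console.log(filterByProperties(items, searchArray, 'pizza').map(item => item['title']));
```

Because the function only looks at the names in the array, adding or removing a searchable field is a one-line change to searchArray.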

For the search, we cheat a bit and use Angular’s built in form input subscription.  A form input field in Angular has an observable you can hook into to get the data as a user types (called valueChanges).  You can then subscribe and do any searching there.  All we need to do is make sure to transform each object a bit in the process:

this.blogViewService.searchSubscription = this.blogViewService.term.valueChanges
    .debounceTime(200)
    .subscribe(result => {
        let itemsToSearch = this.blogViewService.originalItems
            .map(item => {
                item['body_plain_text'] = String(item['body']).replace(/<[^>]+>/gm, ' ');
                return item;
            });
        //we pass to another service to do the actual filtering of originalItems here
    });

We start with assigning a reference to this subscription (searchSubscription) so we can unsubscribe on the component’s destruction (to avoid memory leaks).  Then we hook into the form input observable (valueChanges).  I put a slight delay on the process at this point with debounceTime(200).  When the news/blog section gets big enough that returning it all up front doesn’t make sense, we will have to hit the database in this search, and we won’t want to fire a request on every keystroke.  debounceTime(timeHereInMs) is a great RxJS built-in that handles debouncing your calls.

Finally, we get to the actual change.  The originalItems array is mapped, and the existing properties on each object are left alone.  Instead, we append a ‘body_plain_text’ property to each object, built by a regex that strips the HTML tags out of the body.  Originally, it replaced each tag with nothing, but then words on either side of a tag ended up joined together, so replacing with a single blank space preserves the integrity of the search.  We never change the original ‘body’ property- this is where our HTML lives and is used for the actual display.
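Here is the difference between the two replacements in isolation.  The regex is the one from the snippet above (minus the ‘m’ flag, which has no effect on this pattern); the `stripTags` helper name and the sample body are made up for the example.

```typescript
// Same tag-stripping regex as in the post, wrapped in a small helper.
function stripTags(html: string, replacement: string): string {
  return html.replace(/<[^>]+>/g, replacement);
}

const body = '<p>First paragraph</p><p>Second paragraph</p>';

// Replacing tags with nothing glues neighbouring words together.
const joined = stripTags(body, '');
// Replacing tags with a space keeps the words apart.
const spaced = stripTags(body, ' ');

console.log(JSON.stringify(joined));
console.log(JSON.stringify(spaced));
```

The spaced version does leave runs of spaces where tags sat next to each other.  If searches for phrases that span a tag boundary ever matter, one possible tweak is to collapse whitespace afterwards with something like `.replace(/\s+/g, ' ').trim()`.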

I’m sure there will be edge cases where this might not work and we have to tweak the process, but it’s a good start.  I also don’t think this is technically how you’re supposed to use .map- it’s a functional programming staple, and I’m using it to append a property to an object then return it back to the array.  Definitely not functional programming!
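For comparison, a version closer to how map is usually meant to be used would build a new object for each post instead of adding a property to the existing one.  This is only a sketch: the `Post` and `SearchablePost` interfaces and the `toSearchable` function are names invented here, and it assumes a TypeScript version that supports object spread.

```typescript
interface Post {
  title: string;
  body: string;
}

interface SearchablePost extends Post {
  body_plain_text: string;
}

// Returns new objects; the posts passed in are not modified.
function toSearchable(posts: Post[]): SearchablePost[] {
  return posts.map(post => ({
    ...post,
    body_plain_text: post.body.replace(/<[^>]+>/g, ' '),
  }));
}

const originalItems: Post[] = [{ title: 'Hello', body: '<p>Hi <b>there</b></p>' }];
const itemsToSearch = toSearchable(originalItems);

console.log(itemsToSearch.map(p => p.body_plain_text));
console.log(originalItems.some(p => 'body_plain_text' in p)); // originals stay untouched
```

The trade-off is a fresh set of objects on every search.  For a blog-sized list that cost is probably negligible, but the mutating version does avoid it.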

At your service

So far, we’ve used services in our Angular 2 application to share data between components.  It’s one of the ways of doing so recommended by the official angular.io documentation (along with things like @Input/@Output and EventEmitter).  And so far, I think I like the ‘shared service’ option the best.  It encourages the use of Observables, which I’m coming to like more and more.

Side note- it seems like a lot of the use cases that surround Observables involve streams of data coming from a server (or database as a service, like Firebase).  But they can really be used for a whole lot more (mouse events, drag and drop, etc).  In some of our Angular 2 views, we will use an Observable (and corresponding subscription) to ensure data is updated between components on that same view.

Example: one screen has a set of charts (created in d3.js – future post about that awesome library as soon as I rise above the ‘bumbling moron’ level of proficiency) above a listing of activity types.  An admin can edit those activity types, and when they do, it fires the .next method on an Observable that the chart (same screen) subscribes to.  That way, the chart updates as edits are made below.  In a traditional request-response type website setup, this wouldn’t have been an issue, but only because every edit would have required a full refresh from the server!

Anyway, the point of this post wasn’t supposed to be a commercial for RxJS (also- Observables are not just for JavaScript- there are ReactiveX libraries for plenty of other languages!).  The point was regarding services in Angular 2.  Generally, when you want to share data between components, you’ll declare the service in your providers array at the module level.  If I’m understanding correctly, that creates a single instance of the service (a singleton) that everything in the module shares, which is what enables the sharing capability.  However, sometimes this isn’t what you want.

Example: we created a PaginationService to provide some of the basics of creating ‘pages’ (viewing 10/25/50 items onscreen at a time and allowing scrolling around by links for more).  But when declared at a shared, module level, it doesn’t really work right.  In this case, each component should have its own instance of the service (you don’t want changing pages on one view to also change another- that gets very confusing very quickly).

So, after swearing at the screen for a few minutes, I realized the answer was simply to move the PaginationService out of the module’s providers array and into the individual component’s providers array.  Each component gets its own instance and it works like a charm!
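The difference is easy to see outside of Angular.  The sketch below is plain TypeScript with no dependency injection at all: the stripped-down `PaginationService` and the component-like objects are stand-ins invented for the example, not our real service.

```typescript
// Hypothetical, stripped-down pagination service: it only tracks the current page.
class PaginationService {
  currentPage = 1;

  goToPage(page: number): void {
    this.currentPage = page;
  }
}

// Like a module-level provider: both "components" hold the same instance.
const shared = new PaginationService();
const usersList = { pagination: shared };
const ordersList = { pagination: shared };
usersList.pagination.goToPage(3);
console.log(ordersList.pagination.currentPage); // moved too, which is the bug

// Like component-level providers: each "component" gets its own instance.
const usersOwn = { pagination: new PaginationService() };
const ordersOwn = { pagination: new PaginationService() };
usersOwn.pagination.goToPage(3);
console.log(ordersOwn.pagination.currentPage); // unaffected
```

In Angular itself, the second half corresponds to listing PaginationService in the `providers` array of the `@Component` decorator rather than in the `@NgModule` one, so the injector creates a new instance for each component.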