Category Archives: Web Development

I come across random techniques while working and decided to start blogging them: for me to find later, for others who may need them, and in the hope of getting some feedback.

Demystifying the WordPress Plugin

I asked one of the guys on my team to build a WordPress plugin, and he balked. I realized that he just didn’t get how easy it is, so I thought I’d dedicate some time to dispelling the myth that plugins are complicated to write. “Plugin” sounds formidable, but really, a plugin is a PHP file in the plugins directory with a header comment. That’s it. To build your first plugin, create a file, name it whatever you like (with a .php extension), and place the following code at the top:

<?php
/**
* Plugin Name: Name Of The Plugin
* Plugin URI: http://URI_Of_Page_Describing_Plugin_and_Updates
* Description: A brief description of the Plugin.
* Version: The Plugin's Version Number, e.g.: 1.0
* Author: Name Of The Plugin Author
* Author URI: http://URI_Of_The_Plugin_Author
* License: A "Slug" license name e.g. GPL2
*/

Upload that file to "YOUR-SITE-ROOT/wp-content/plugins/" and voilà!

Go to your plugins menu and you’ll see your shiny new plugin there.

What do you put in there besides the comment? Typically, anything you’d put in your theme’s functions.php file. The difference is that functions.php runs everything you put in it for as long as the theme is active, while a plugin’s code only runs once you “activate” it, and you can deactivate it again whenever you like.

Here’s a great post with 25+ potential plugins. Let’s try #3 (Don’t forget to change the plugin’s name):

<?php
/**
* Plugin Name: Remove WordPress Version Number
* Plugin URI: http://URI_Of_Page_Describing_Plugin_and_Updates
* Description: A brief description of the Plugin.
* Version: The Plugin's Version Number, e.g.: 1.0
* Author: Name Of The Plugin Author
* Author URI: http://URI_Of_The_Plugin_Author
* License: A "Slug" license name e.g. GPL2
*/
function wpbeginner_remove_version() {
  return '';
}
add_filter('the_generator', 'wpbeginner_remove_version');

See! Wasn’t that easy?

Typically, you’ll want to keep code that your theme needs in order to run in functions.php, and make plugins for all functionality that is separate from the theme.

There’s a lot more to say about plugins, and the example above is rather simple, but it should get you on your way.

 


    Database Indices

    I recently had the pleasure of indexing our company’s sites’ tables. The custom CMS I inherited has some really brilliant code, but it also has quite a lot of idiotic code blocks. In this case, the database was not thought out as well as it could have been.

    I was able to cut some ridiculously long queries from 10 seconds down to 2 seconds with a few well-placed indices. While 2 seconds isn’t anything to brag about, it’s quite an improvement. A database index is a lookup structure that the database maintains alongside a table so it can find matching rows without scanning the whole table.

    Before jumping in, one thing to note about indices is that they come at an expense. Every index takes up space and has to be updated whenever a row is inserted, updated or deleted, so if you go ahead and index every column you can easily end up slower than you started. You have to be discerning about what you index.

    The rule of thumb is that any column that shapes the query, typically one in a WHERE, JOIN, ORDER BY or GROUP BY clause, is a potential candidate for an index. For me, the rest of the process was trial and error: I went through all the potential candidates in my longest queries and tried various combinations until I came up with the quickest time for each query. There may be an algorithm for calculating the best columns to index (running EXPLAIN on a query at least shows which index, if any, MySQL picks), but this worked.
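    The trial-and-error loop can be sketched in a few lines. This is only an illustration under stated assumptions: an in-memory SQLite database (via PDO) stands in for MySQL so the snippet runs anywhere, and the `articles` table, its columns and the index name are made up. On MySQL you would run plain `EXPLAIN` rather than SQLite's `EXPLAIN QUERY PLAN`.

```php
<?php
// Sketch only: SQLite in memory stands in for MySQL; table, column and
// index names are hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE articles (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT)');
$insert = $pdo->prepare('INSERT INTO articles (author_id, title) VALUES (?, ?)');
for ($i = 1; $i <= 500; $i++) {
    $insert->execute(array($i % 25, "Article $i"));
}

// Return the planner's description of how it intends to run a query.
function query_plan(PDO $pdo, string $sql): string
{
    $rows = $pdo->query('EXPLAIN QUERY PLAN ' . $sql)->fetchAll(PDO::FETCH_ASSOC);
    return implode('; ', array_column($rows, 'detail'));
}

$sql = 'SELECT * FROM articles WHERE author_id = 7';

$plan_before = query_plan($pdo, $sql);   // no index yet: a full table scan
$pdo->exec('CREATE INDEX idx_articles_author ON articles (author_id)');
$plan_after = query_plan($pdo, $sql);    // the WHERE column is now indexed

echo "Before: $plan_before\n";
echo "After:  $plan_after\n";
```

    The exact wording of the plan differs between database versions, but the "before" plan should describe a scan of the whole table and the "after" plan should name the new index.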

    phpMyAdmin doesn’t make it easy to find your indices.

    ( Don’t get any funny ideas, this is just a typical WordPress install. )

    1) Open a table, click “Structure” at the top.

    2) At the bottom of the page is a tiny link that says “Indexes”. You can manage existing indices from here.

    3) To add an index, check to see if there’s a “More” menu on the column you’d like to index…

    Enjoy!


      Leveling Up Your Development Skills with a Pinch of SysAdmin

       “If you’re developing properly, you shouldn’t have to worry about whether your site is running Apache, Nginx or anything else. Your code should just work.”
      - anon

      When developing a plugin or theme for use by the greater community this is certainly true. Anything you put out there for someone else should absolutely not be dependent on the platform.

      However, if you already have a platform running and your code serves a purpose other than being a plugin or theme for other people to use on their sites (if you are running a service yourself, say), then you want to make sure your code will run on your production server before pushing it there. That’s why I keep all my servers running the same infrastructure. I have people who rely on my site running as expected.

      I have a Virtual Machine (VM) running on which I develop. VMs are great because you control 100% of what is running on them.

      Developing on a Mac is fun because its core is Unix-esque. Many developers will just use the PHP that is right there, or use MAMP. MAMP is a very good way to get started.

      A native setup like that works 99% of the time. But I found that in a few edge cases, if you’re not developing in the same environment that your production server runs on, it can make debugging complicated… and I HATE debugging my production server live. If you do that you might as well skip all other layers and code commando, straight on production.

      Another benefit of running your own VM is that you learn what goes on under the hood. Sometimes your code isn’t the only thing responsible for speed and performance. If you’re on a shared host, there’s a lot you can’t do. Once you practice on a local development environment, you might just find that you’ve built up enough gumption to run your own live server yourself.

      The lowest tier on Digital Ocean is certainly comparable in price to any shared hosting. The benefits are nice, though. Want to try out Nginx instead of Apache? Sure! Want to use FastCGI instead of running PHP on top of Apache? Go for it! How about just a simple APC install? No arguing on the phone with customer support. Just do it!

      Doing this IS scary. The buck stops with YOU. Make sure you have proper fail-safes, backups, etc. Digital Ocean has daily backups, which is nice. But if you’re hacked or you get a Reddit bump you need to handle it yourself.

      Doing this also means that you will be competent to spin up multiple environments for testing, thus validating the quote at the beginning of the post.

      So how do I do that?

      • On my Mac I run VMware Fusion. I found it far superior to Parallels for running a local server.
      • Digital Ocean has wonderful tutorials for spinning up servers. Try the LAMP stack. Want to run Nginx? No problem. I recommend trying several of these, a few times. The cool thing with a VM is that you can delete it and begin again, as many times as you need. Get to a good place? Take a snapshot and roll on.
      • I deploy with Git, which basically consists of making sure your local server can SSH to your live server(s). Since most of the people who use the sites I manage also manage the content themselves, I don’t have to worry about syncing databases, so I haven’t worked out a solution for that.
      • When I work with a team on a project I’ll typically have a staging server. It’s basically a clone of production, only without an easily accessible URL. We coordinate with a central repository and test on the staging server. When code is ready to be shipped, it’s pushed to production.

      When I keep everything the same, I don’t have to worry about deploying. If it works locally, and it works in the staging area, I can be confident that it’ll work on production.

      I’d love to hear your thoughts.

       


        Handling a PHP unserialize offset error… and why it happens

        I recently discovered the importance of choosing the right character set and collation for database tables. I inherited a proprietary CMS to manage. The default collation was latin1_swedish_ci. Apparently that’s because “The bloke who wrote it was co-head of a Swedish company”. The problem occurred when a form we had on our site began getting submissions with foreign characters. The latin1 tables couldn’t store the characters and were saving them as question marks (?). Those submissions were saved as serialized PHP data, which is where the trouble started.

        “Serialization is the process of translating data structures or object state into a format that can be stored.” For example, the array:

        $returnValue = serialize(array('hello', 'world'));

        Will become:

        a:2:{i:0;s:5:"hello";i:1;s:5:"world";}

        This is what the above string means:

        • There is an array that is 2 in length. a:2.
        • The first item in the array has a key that is an integer with the value of 0. i:0.
        • The value for that item is a string that is 5 characters long, which is “hello”. s:5.
        • The second item in the array has a key that is an integer with the value of 1. i:1.
        • The value for that item is a string that is 5 characters long, which is “world”. s:5.

        An unserialize offset error can occur when the string length recorded in the serialized data does not match the actual length of the string. In the above example that would look like this:

        a:2:{i:0;s:4:"hello";i:1;s:5:"world";}

        Notice the number ‘4’, while there are really 5 characters in the word ‘hello’.

        So the question is, why would the offset happen when a ? replaces a foreign character?

        To understand why, you need to dig into how UTF-8 works and things will become clear.

        In UTF-8, ‘?’ is the single byte ‘3f’, while ‘Æ’ is the two bytes ‘c3 86’. So '?' serializes to s:1:"?"; while 'Æ' serializes to s:2:"Æ";. Notice the 2 replacing the 1 in the string length: PHP counts bytes, not characters. So basically, what’s happening is that when PHP serializes the data it records the foreign character as two bytes long, but when it’s passed to MySQL and the table isn’t set up for UTF-8, the database converts the character to a ?, which is stored as a single byte. The recorded length is not updated, so when you go to unserialize the data there is an offset error.
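        The mismatch is easy to reproduce without a database. In this small sketch the `str_replace()` call is only a stand-in for what the latin1 table did to the data, and the variable names are made up for the illustration.

```php
<?php
// "\xC3\x86" is the UTF-8 encoding of 'Æ', spelled out byte by byte so the
// example does not depend on how this file itself is saved.
$ae = "\xC3\x86";

$serialized = serialize(array($ae));   // a:1:{i:0;s:2:"Æ";}

// Simulate the damage: the two-byte character comes back as a single '?',
// but the recorded length still says 2.
$corrupted = str_replace($ae, '?', $serialized);   // a:1:{i:0;s:2:"?";}

var_dump(strlen($ae));                 // int(2): strlen() counts bytes
var_dump(strlen('?'));                 // int(1)
var_dump(@unserialize($corrupted));    // bool(false): the offset error
```

        The `@` only silences the notice so the output stays readable. The failed unserialize() still returns false.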

        How to resolve the problem

        There are several articles that provide solutions. The most popular is to wrap the serialized data in base64_encode(). This prevents the data from getting corrupted, since base64 converts it to plain ASCII, which any character set can store.

        //to safely serialize
        $safe_string_to_store = base64_encode(serialize($multidimensional_array));
        
        //to unserialize...
        $array_restored_from_db = unserialize(base64_decode($encoded_serialized_string));

        If you don’t have access to your database, or don’t want to fool with it, this is a great solution. You can also set your table collation to utf8_general_ci or utf8_unicode_ci (or the utf8mb4 equivalents, since MySQL’s utf8 only stores characters of up to three bytes) and that should solve your problem as well (that’s what we did).

        But what if you already have bad data in your database, like we had, and you’re getting the horrid ‘Notice: unserialize() [function.unserialize]: Error at Offset’ error? When you get this notice, chances are you’re not getting all your data either…
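        To see why the base64 wrapper survives a non-UTF-8 table, here is a quick round trip. It is a sketch with made-up form data; the only point is that the stored string contains nothing but ASCII.

```php
<?php
// Hypothetical form submission containing a non-ASCII character (UTF-8 'Æ').
$submission = array('name' => "\xC3\x86gir", 'city' => 'Oslo');

$to_store = base64_encode(serialize($submission));

// Base64 output uses only A-Z, a-z, 0-9, '+', '/' and '=' padding,
// so a latin1 column has nothing to mangle.
var_dump(preg_match('/^[A-Za-z0-9+\/=]+$/', $to_store) === 1);  // bool(true)

$restored = unserialize(base64_decode($to_store));
var_dump($restored === $submission);                            // bool(true)
```

        The trade-off is that the encoded value is roughly a third larger and can no longer be searched with SQL.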

        Here’s what you do:

        $fixed_serialized_data = preg_replace_callback(
            '!s:(\d+):"(.*?)";!',
            function ($match) {
                // Keep tokens whose recorded length is right; otherwise recount it.
                return ($match[1] == strlen($match[2]))
                    ? $match[0]
                    : 's:' . strlen($match[2]) . ':"' . $match[2] . '";';
            },
            $error_serialized_data
        );
        

        This will search out the strings, recount the length, and replace the string length with the correct value. Unfortunately it cannot recover what the original foreign character was, but at least the rest of your data will load.
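        Here is the repair applied to the ‘hello’/‘world’ style example from earlier, as a sketch. The wrapper function name and the sample string are ours, not part of the original StackOverflow answer.

```php
<?php
// Hypothetical wrapper around the same preg_replace_callback() repair.
function fix_serialized_lengths(string $data): string
{
    return preg_replace_callback(
        '!s:(\d+):"(.*?)";!',
        function ($match) {
            // Rebuild the string token with the byte length it actually has.
            return 's:' . strlen($match[2]) . ':"' . $match[2] . '";';
        },
        $data
    );
}

// 'Æ' was stored as '?', so the recorded length (2) no longer matches.
$corrupted = 'a:2:{i:0;s:2:"?";i:1;s:5:"world";}';

var_dump(@unserialize($corrupted));   // bool(false)

$fixed = fix_serialized_lengths($corrupted);
echo $fixed, "\n";                    // a:2:{i:0;s:1:"?";i:1;s:5:"world";}
print_r(unserialize($fixed));         // element 0 is '?', element 1 is 'world'
```

        One caveat: the pattern stops at the first `";` it finds and `.` does not match newlines, so stored strings that contain either may not be repaired correctly.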

        I got the original code from StackOverflow, but the /e modifier in preg_replace() has been deprecated since PHP 5.5 (and was removed entirely in PHP 7), so the original preg_replace() statement will error out. So I rewrote it with preg_replace_callback().


          Using a post-receive Git hook to mark a deployment in New Relic

          I recently started monitoring my systems with New Relic. Fantastic tool.

          One fun feature they provide is that you can mark in New Relic’s dashboard when you’ve deployed new code. This way you can compare your site performance before and after the deploy.

          curl -H "x-api-key:YOUR_API_KEY_HERE" -d "deployment[app_name]=iMyFace.ly Production" -d "deployment[description]=This deployment was sent using curl" -d "deployment[changelog]=many hands make light work" -d "deployment[user]=Joe User" https://api.newrelic.com/deployments.xml

          Using Git’s post-receive hook is perfect for this, especially since I already use it to deploy my sites to the various servers.

          The only question I had was, how would I get the various variables from the post-receive hook into the curl statement?

          Well, here you go:

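          description=$(git log -1 --pretty=format:%s)   # subject line of the latest commit
          author=$(git log -1 --pretty=format:%cn)        # committer name
          revision=$(git log -1 --pretty=format:%H)       # full commit hash (%T would give the tree hash)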

          Now you can do this:

          curl -H "x-api-key:YOUR_API_KEY_HERE" -d "deployment[app_name]=iMyFace.ly Production" -d "deployment[description]=$description" -d "deployment[user]=$author" -d "deployment[revision]=$revision" https://api.newrelic.com/deployments.xml

            Introducing Assets Manager for WordPress

            Note: if the links aren’t working properly, resave the pretty permalinks settings.

            Download

            Many of the companies my current employer works with have stricter firewall rules than most (they also tend to use IE7, such is life). Because of this, we were having issues sharing files with our constituents using the current industry file-sharing tools.

            To solve this problem I was tasked with creating a custom version of the corporate file-sharing webapps for internal use. All the links would be hosted on our domain, so we wouldn’t have to worry about getting third parties’ domains whitelisted in other companies’ firewalls.

            I decided that WordPress would be the best tool to build this on. It already has wonderful custom post management abilities as well as built-in media management tools.

            I’m proud of what I built, so I got permission to release it to the WordPress community as a white-labeled plugin. Special thanks to @binmind for his extensive QA testing of the company’s plugin. His testing was crucial for development of the proof of concept and for making sure everything was working as it should.

            Instead of releasing the plugin as-is, I decided to rebuild it from scratch. I’ve learnt a lot since building the original assets manager and wanted to harden up the code base before releasing it to the public. Here are the results of my efforts.

            Features

            Path Obfuscation:

            When a file is uploaded to WordPress you usually access it by linking directly to the location of where the file is hosted on the server. Assets Manager creates a unique obfuscated link for the file instead. When a file is downloaded it will receive the name you supply.

            This does two things:

            1. You can’t figure out where the file is actually hosted, nor can you find other files based on some pattern. This is a security feature. Since the links to the files do not indicate anything about where the files are, or what they will be called when downloaded, you can’t guess where other files are stored.
            2. Files are never linked to directly; they are read and served. This allows #1 to work. It also means that before the file is served, Assets Manager can check various things, like whether the user is logged in or whether the file has “expired”.

            When should this file expire?

            Because of #2 above, Assets Manager intercepts files before they are served to the user from the server. This means that you can decide when and how the file will be served. I’ve included the ability to set how long the file should last. If you see you’re running out of time, you can extend the expiration by as long as you wish. The expiration date of the file is displayed next to the expiration feature letting you know when the file will expire.

            Enable this file?

            Same as the above feature. If you send out the wrong link, you can easily edit the settings and uncheck “Enabled”.

            Secure this file?

            Assets Manager can also check to see if a user is logged in before serving them the file. This doesn’t actually make the file itself secure: if someone downloads it, they can send it anywhere. It only secures the link to the file.
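            Taken together, the expiry, enabled and secure settings amount to a gate that runs before a file is read and served. The sketch below is purely illustrative: the function and the field names (`enabled`, `expires_at`, `secure`) are hypothetical and are not taken from the Assets Manager source.

```php
<?php
// Hypothetical sketch of a "may this file be served?" check.
function asset_is_servable(array $asset, bool $user_logged_in, int $now): bool
{
    if (empty($asset['enabled'])) {
        return false;                       // link was switched off
    }
    if ($asset['expires_at'] !== null && $now >= $asset['expires_at']) {
        return false;                       // link has expired
    }
    if (!empty($asset['secure']) && !$user_logged_in) {
        return false;                       // login required
    }
    return true;
}

$now   = 1700000000;  // fixed timestamp so the example is reproducible
$asset = array('enabled' => true, 'expires_at' => $now + 3600, 'secure' => true);

var_dump(asset_is_servable($asset, true, $now));          // bool(true)
var_dump(asset_is_servable($asset, false, $now));         // bool(false)
var_dump(asset_is_servable($asset, true, $now + 7200));   // bool(false)
```

            Because every request passes through a check like this, changing a setting takes effect on links that have already been sent out.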

            Remove file

            When a file is removed it is not deleted; it can still be found in the media library. It is just detached from that asset set. You can delete it via the media library if you wish.

            Stats

            A basic hit count is recorded per file.

            Asset Set

            Each asset set is a post of a custom post type, and the uploaded files are attached to this post. The URL for the asset set is obfuscated to protect its location. If it is linked to from somewhere, though, it can still be indexed, since bots crawling the site can find it.

            You can upload a set of files, then only share the one link. That way if you decide to change the links around you can. Only available files will be listed there. So if a file is “secure” and the user isn’t logged in, they won’t see it, nor will anyone see expired and disabled files.

            Future features I’m working on:

            • Sha1: If you upload a file that already exists it will link that file to your post instead of keeping multiple versions of the file. I believe that WordPress should work this way in general, all filesystems for that matter. That’s a benefit of networks. Why keep doubles, unless you intentionally are backing up the information?
            • File replacement: After uploading and even sharing a file you’ll be able to replace the file behind the active link with a file of the same MIME type. This way if you make a typo you can fix it quickly and replace the file without sending out a new link.

            What do you think?

            If you have ideas or discover bugs, let me know.

              How WordPress Works: Dissecting the Database

              The WordPress Database

              There is beauty in the simplicity of WordPress’ database structure. All the functionality of posts, pages, custom posts, taxonomy, users and core settings is here, in 11 tables.

              For comparison, the almighty Drupal has 72 tables, Joomla has 68.

              All posts, pages and custom posts are saved in the `wp_posts` table. They are differentiated by the `post_type` column. Any additional data you need to save with your post (whatever the post_type is) can be stored in `wp_postmeta`.

              Metas are extremely powerful. You can extend everything in pretty much any way with them.

              Example: Your site manages the courses of an educational institute, so you create the post_types ‘Course’ and ‘Lecturer’. You can save everything about a ‘Course’ or ‘Lecturer’ in `post_content`, but what if you need to store extra information about each that you’ll need to access easily? For a course you might want to know the dates the course is taking place. If you save that in `post_content`, as part of the other descriptive content, you will not be able to run queries on that information easily: you can’t sort by it, pull it out for widgets, etc. That’s where meta comes in.

              wp_postmeta table

              Each of the meta tables (postmeta, commentmeta and usermeta) has 4 columns: meta_id, post_id (or the equivalent), meta_key, and meta_value. Each post can have whatever extra meta you need, and it can be pulled out with a simple SELECT ... WHERE meta_key = 'X'; command.
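              In practice you rarely write that SELECT by hand, because WP_Query can filter and sort on meta for you. The following is a sketch for the course example: the `course` post type slug and the `course_start_date` meta key are hypothetical names, and the dates are assumed to be stored as YYYY-MM-DD strings.

```php
<?php
// Hypothetical names: the 'course' post type and 'course_start_date' meta key
// follow the example above and are not part of WordPress itself.
function upcoming_courses_query_args(string $today): array
{
    return array(
        'post_type'  => 'course',
        'meta_key'   => 'course_start_date',   // which meta row to sort on
        'orderby'    => 'meta_value',
        'order'      => 'ASC',
        'meta_query' => array(
            array(
                'key'     => 'course_start_date',
                'value'   => $today,           // assumed stored as YYYY-MM-DD
                'compare' => '>=',
                'type'    => 'DATE',
            ),
        ),
    );
}

// Only runs inside WordPress, where WP_Query and get_post_meta() exist.
if (class_exists('WP_Query')) {
    $courses = new WP_Query(upcoming_courses_query_args(date('Y-m-d')));
    foreach ($courses->posts as $course) {
        echo get_post_meta($course->ID, 'course_start_date', true), "\n";
    }
}
```

              Keeping the date in its own meta row is what makes the sort and the "from today onwards" filter possible. Neither would work if the date were buried in post_content.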

              And that’s pretty much it. All of WordPress’s functionality is there. Comments, users, and posts all have their basic structure in their main table and all can be extended as much as needed through their meta.

              Taxonomy is somewhat more complicated. It requires 3 tables. wp_term_taxonomy records which taxonomy each term belongs to: Categories, Tags, and any other custom taxonomy type you create will be here. The individual terms will be in wp_terms. So if you have 3 categories and 15 tags in your site, each of those will be stored in wp_terms. wp_term_relationships links the terms to your posts, keeping it all in order. Easy-peasy, right?

              The basic options of the WordPress install are in wp_options. The only table out of order is wp_links, a relic of installs past. Today all the link functionality can easily be incorporated as a custom post type. But because WordPress cares about backwards compatibility, the table remains.

              That’s it. Lean and mean.

              One question that comes up about meta is: doesn’t that mean there are a lot of extra queries hitting the database? This would be true, if not for WordPress’s caching system. The first time you ask for a post’s meta, WordPress loads all of that post’s meta rows and caches them for the rest of the request, so each call to get_post_meta() isn’t a new hit on the database. So you’re good.

              So when people say that WordPress is “bloated” I’m not quite sure what they’re talking about.


                How I Optimized My LAMP Server

                I recently switched servers for this site. I moved from Media Temple to Digital Ocean. Think of Digital Ocean as AWS but faster, cheaper, and with great UX. I’ve been meaning to move there for a while, ever since I figured out how to manage my own LAMP stack.

                One benefit of Digital Ocean is their fantastic documentation. So there isn’t much to figure out… But for someone who came from Front-end Development, it’s a bit intimidating to manage your own server. To tell you the truth, I’ve tried this move a few times, but the last time I set up a stack for this site I used SUSE Linux (I don’t know what I was thinking), and the site kept crashing.

                Since then I’ve played with VMware and got comfortable with setting up my own development server, and moved to CentOS.

                The missing link was optimizing Apache.

                I’m a big fan of This Week in Startups and one of their sponsors is New Relic. If they say something is worth trying, I try it.

                After switching to Digital Ocean I set up New Relic on the new site. Even though I had installed W3 Total Cache on my install, New Relic was still giving me error warnings every 10-15 minutes. Frustrating! True, I AM running a WordPress multisite on the lowest tier, but none of the sites are high traffic. I should be able to do that.

                Well, after digging into New Relic’s errors I saw that I was using 100% of my physical memory and 200% of my swap memory. BAD.

                Then I found Jean-Sebastien Morisset’s check_httpd_limits.pl. WOW.

                I updated my httpd.conf with his recommendations and look at the results:

                Physical Memory - New Relic Dashboard

                You can clearly see when the new settings took effect.

                Here’s the site’s load average:

                Load Average - New Relic Dashboard

                Best part is, since these settings took effect, NO MORE ERROR WARNINGS FROM NEW RELIC!!!

                So, if you read this Jean-Sebastien, thanks for your wonderful tool! And New Relic, thank YOU for your excellent monitoring that pushed me to do this!