Category Archives: Web Development

I come across random techniques while working and decided to start blogging them… For me to find later, for others who may need them and with the hope to get some feedback.

Leveling Up Your Development Skills with a Pinch of SysAdmin

 “If you’re developing properly, you shouldn’t have to worry about whether your site is running Apache, Nginx or anything else. Your code should just work.”
- anon

When developing a plugin or theme for use by the greater community, this is certainly true. Anything you put out there for someone else should absolutely not be dependent on the platform.

However, if you already have a platform running, and your code serves your own service rather than being a plugin or theme for other people to use on their sites, then you want to make sure your code will run on your production server before pushing it there. That’s why I keep my servers running the same infrastructure. I have people who rely on my site running as expected.

I develop on a Virtual Machine (VM). VMs are great because you control 100% of what is running on them.

Developing on a Mac is fun because its core is Unix-esque. Many developers will just use the PHP that is right there, or use MAMP. MAMP is a very good way to get started.

A native setup like that works 99 percent of the time. But I found that in a few edge cases, if you’re not developing in the same environment that your production server runs on, it can make debugging complicated… and I HATE debugging my production server live. If you do that you might as well skip all other layers and code commando, straight on production.

Another benefit of running your own VM is that you learn what goes on under the hood. Sometimes your code isn’t the only thing responsible for speed and performance. If you’re on a shared host, there’s a lot you can’t do. Once you practice on a local development environment, you might just find that you’ve built up enough gumption to run your own live server yourself.

The lowest tier on Digital Ocean is certainly comparable in price to any shared hosting. The benefits are nice, though. Want to try out Nginx instead of Apache? Sure! Want to use FastCGI instead of running PHP on top of Apache? Go for it! How about just a simple APC install? No arguing on the phone with customer support. Just do it!

Doing this IS scary. The buck stops with YOU. Make sure you have proper fail safes, backups, etc. Digital Ocean has daily backups, which is nice. But if you’re hacked or you get a Reddit bump you need to handle it yourself.

Doing this also means that you will be competent to spin up multiple environments for testing, which in turn validates the quote at the beginning of the post.

So how do I do that?

  • On my Mac I run VMware Fusion. I found it far superior to Parallels for running a local server.
  • Digital Ocean has wonderful tutorials for spinning up servers. Try the LAMP stack. Want to run Nginx? No problem. I recommend trying several of these, a few times. The cool thing with a VM is that you can delete it and begin again, as many times as you need. Get to a good place? Take a snapshot and roll on.
  • I deploy with Git, which basically consists of making sure your local server can SSH to your live server(s). Since most of the people who use the sites I manage also manage the content themselves, I don’t have to worry about syncing databases, so I haven’t worked out a solution for that.
  • When I work with a team on a project I’ll typically have a staging server. It’s basically a clone of production, only without an easily accessible URL. We coordinate with a central repository and test on the staging server. When code is ready to be shipped, it’s pushed to production.

When I keep everything the same, I won’t have to worry about deploying. If it works locally, and it works in the staging area, I can be assured that it’ll work on production.

I’d love to hear your thoughts. Discuss:

 


    Handling a PHP unserialize offset error… and why it happens

    I recently discovered the importance of proper collation of database tables. I inherited a proprietary CMS to manage. The default collation was latin1_swedish_ci. Apparently it’s because “The bloke who wrote it was co-head of a Swedish company”. The problem occurred when a form we had on our site began getting submissions with foreign characters. The database collation couldn’t accept the characters and was saving them as question marks (?).

    “Serialization is the process of translating data structures or object state into a format that can be stored.” For example, the array:

    $returnValue = serialize(array('hello', 'world'));

    Will become:

    a:2:{i:0;s:5:"hello";i:1;s:5:"world";}

    This is what the above string means:

    • There is an array that is 2 in length. a:2.
    • The first item in the array has a key that is an integer with the value of 0. i:0.
    • The value for that item is a string that is 5 characters long, which is “hello”. s:5.
    • The second item in the array has a key that is an integer with the value of 1. i:1.
    • The value for that item is a string that is 5 characters long, which is “world”. s:5.

    An unserialize offset error can occur when the string count in the serialized data does not match the length of the string being saved. So in the above example that would look like this:

    a:2:{i:0;s:4:"hello";i:1;s:5:"world";}

    Notice the number ‘4’, while there are really 5 characters in the word ‘hello’.

    So the question is, why would the offset happen when a ? replaces a foreign character?

    To understand why, you need to dig into how UTF-8 works and things will become clear.

    The UTF-8 encoding of ‘?’ is the single byte ‘3f’, while ‘Æ’ is the two bytes ‘c3 86’. '?' serializes to s:1:"?"; while 'Æ' serializes to s:2:"Æ";. Notice the 2 replacing the 1 in the string length: PHP counts bytes, not characters. So basically, when PHP serializes the data it records the foreign character as two bytes long, but when the data is passed to MySQL and the table isn’t set up for UTF-8, the database converts the character to a ?, which is then stored as a single byte. The serialized length is not updated, so when you go to unserialize the data there is an offset error.
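To see the mismatch for yourself, here is a minimal sketch that fakes the database step with str_replace(). The variable names are mine, and the replacement only imitates what a non-UTF-8 table did to our data; it assumes nothing about your actual MySQL setup.

```php
<?php
// "\xC3\x86" is the character Æ written out as its two UTF-8 bytes.
$original = serialize(array("\xC3\x86"));
echo $original, "\n"; // a:1:{i:0;s:2:"Æ";}

// Imitate the database swapping the unsupported character for a single "?".
// The s:2 length prefix is left untouched, which is the whole problem.
$corrupted = str_replace("\xC3\x86", '?', $original);
echo $corrupted, "\n"; // a:1:{i:0;s:2:"?";}

// unserialize() reads 2 bytes, does not find the closing quote where it
// expects it, and gives up. The @ only hides the offset notice here.
$result = @unserialize($corrupted);
var_dump($result); // bool(false)
```

Remove the @ and you will see the familiar "Error at offset" message.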

    How to resolve the problem

    There are several articles that provide solutions. The most popular is to use the base64_encode() function around the serialized data. This will prevent the data from getting corrupted since base64 converts the data to ASCII which any collation can take.

    //to safely serialize
    $safe_string_to_store = base64_encode(serialize($multidimensional_array));
    
    //to unserialize...
    $array_restored_from_db = unserialize(base64_decode($encoded_serialized_string));

    If you don’t have access to your database, or don’t want to fool with it, this is a great solution. You can also set your table collation to utf8_general_ci or utf8_unicode_ci and that should solve your problem as well (that’s what we did).

    But what if you already have bad data in your database, like we had, and you’re getting the horrid ‘Notice: unserialize() [function.unserialize]: Error at Offset’ error? When you get this notice, chances are you’re not getting all your data either…

    Here’s what you do:

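    // Recount every serialized string and rewrite its length prefix if it is wrong.
    // The trailing "s" modifier lets "." match newlines inside stored strings.
    $fixed_serialized_data = preg_replace_callback(
        '!s:(\d+):"(.*?)";!s',
        function ($match) {
            $length = strlen($match[2]);
            return ($match[1] == $length) ? $match[0] : 's:' . $length . ':"' . $match[2] . '";';
        },
        $error_serialized_data
    );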
    

    This will search out the strings, recount the length, and replace the string length with the correct value. Unfortunately it cannot recover what the original foreign character was, but at least the rest of your data will load.
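Here is a small, self-contained sketch of that repair in action. The function name sketch_repair_serialized() and the sample string are made up for illustration. One caveat I'm aware of: the pattern stops at the first "; it finds, so a stored string that itself contains "; will not be repaired correctly.

```php
<?php
// Hypothetical wrapper around the same preg_replace_callback() technique.
function sketch_repair_serialized($data)
{
    return preg_replace_callback(
        '!s:(\d+):"(.*?)";!s',
        function ($match) {
            $length = strlen($match[2]);
            // Keep correct entries as-is, rewrite the prefix on wrong ones.
            return ($match[1] == $length)
                ? $match[0]
                : 's:' . $length . ':"' . $match[2] . '";';
        },
        $data
    );
}

// The first string claims 2 bytes but only holds a single "?".
$corrupted = 'a:2:{i:0;s:2:"?";i:1;s:5:"world";}';
$fixed = sketch_repair_serialized($corrupted);

echo $fixed, "\n"; // a:2:{i:0;s:1:"?";i:1;s:5:"world";}
print_r(unserialize($fixed));
```

The "?" stays a "?", but the array loads again with both items.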

    I got the original code from StackOverflow, but the /e modifier in preg_replace() has been deprecated since PHP 5.5 (and was removed in PHP 7), so the preg_replace() statement originally suggested will error out. So I rewrote it with preg_replace_callback().


      Using a post-receive Git hook to mark a deployment in New Relic

      I recently started monitoring my systems with New Relic. Fantastic tool.

      One fun feature they provide is that you can mark in New Relic’s dashboard when you’ve deployed new code. This way you can compare your site performance before and after the deploy.

      curl -H "x-api-key:YOUR_API_KEY_HERE" -d "deployment[app_name]=iMyFace.ly Production" -d "deployment[description]=This deployment was sent using curl" -d "deployment[changelog]=many hands make light work" -d "deployment[user]=Joe User" https://api.newrelic.com/deployments.xml

      Using Git’s post-receive hook is perfect for this, especially since I already use it to deploy my sites to the various servers.

      The only question I had was, how would I get the various variables from the post-receive hook into the curl statement?

      Well, here you go:

      description=$(git log -1 --pretty=format:%s)
      author=$(git log -1 --pretty=format:%cn)
      revision=$(git log -1 --pretty=format:%H)

      Now you can do this:

      curl -H "x-api-key:YOUR_API_KEY_HERE" -d "deployment[app_name]=iMyFace.ly Production" -d "deployment[description]=$description" -d "deployment[user]=$author" -d "deployment[revision]=$revision" https://api.newrelic.com/deployments.xml

        Introducing Assets Manager for WordPress

        Note: if the links aren’t working properly, resave the pretty permalinks settings.

        Download

        Many of the companies my current employer works with have stricter security on their firewalls (they also tend to use IE7, such is life). Because of this, we were having issues sharing files with our constituents using the current industry file-sharing tools.

        To solve this problem I was tasked with creating a custom version of the corporate file-sharing web apps for internal use. All the links would be hosted on our domain, so we wouldn’t have to worry about getting third parties’ domains whitelisted in other companies’ firewalls.

        I decided that WordPress would be the best tool to build this on. It already has wonderful custom post management abilities as well as built-in media management tools.

        I’m proud of what I built, so I got permission to release it to the WordPress community as a white-labeled plugin. Special thanks to @binmind for his extensive QA testing of the company’s plugin. His testing was crucial for developing the proof of concept and making sure everything was working as it should.

        Instead of releasing the plugin as-is, I decided to rebuild it from scratch. I’ve learnt a lot since building the original Assets Manager and wanted to harden the code base before releasing it to the public. Here are the results of my efforts.

        Features


        Path Obfuscation:

        When a file is uploaded to WordPress you usually access it by linking directly to the location of where the file is hosted on the server. Assets Manager creates a unique obfuscated link for the file instead. When a file is downloaded it will receive the name you supply.

        This does two things:

        1. You can’t figure out where the file is actually hosted, nor can you find other files based on some pattern. This is a security feature. Since the links to the files do not indicate anything about where the files are, or what they will be called when downloaded, you can’t guess where other files are stored.
        2. Files are never linked to, they are read and served. This allows #1 to work. It also means that before the file is served, Assets Manager can check various things, like if the user is logged in or if the file has “expired”.

        When should this file expire?

        Because of #2 above, Assets Manager intercepts files before they are served to the user from the server. This means that you can decide when and how the file will be served. I’ve included the ability to set how long the file should last. If you see you’re running out of time, you can extend the expiration by as long as you wish. The expiration date of the file is displayed next to the expiration feature letting you know when the file will expire.

        Enable this file?

        Same as the above feature. If you send out the wrong link, you can easily edit the settings and uncheck “Enabled”.

        Secure this file?

        Assets Manager can also check whether a user is logged in before serving them the file. This doesn’t actually make the file secure: if someone downloads it, they can send it anywhere. It only secures the link to the file.
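To make the three checks (enabled, expired, secure) concrete, here is a rough sketch of the decision that has to happen before a file is read and served. The function name and the array keys are hypothetical and only illustrate the idea; they are not the plugin's actual code or storage format.

```php
<?php
// Hypothetical settings array: 'enabled' (bool), 'expires_at' (Unix
// timestamp or null for "never"), 'secure' (bool).
function sketch_asset_is_servable(array $asset, $now, $user_logged_in)
{
    // "Enable this file?" unchecked: never serve.
    if (empty($asset['enabled'])) {
        return false;
    }
    // "When should this file expire?": refuse once the time has passed.
    if (isset($asset['expires_at']) && $now >= $asset['expires_at']) {
        return false;
    }
    // "Secure this file?": only logged-in users get the file.
    if (!empty($asset['secure']) && !$user_logged_in) {
        return false;
    }
    return true;
}

$asset = array('enabled' => true, 'expires_at' => 2000, 'secure' => true);

var_dump(sketch_asset_is_servable($asset, 1000, true));  // bool(true)
var_dump(sketch_asset_is_servable($asset, 1000, false)); // bool(false), not logged in
var_dump(sketch_asset_is_servable($asset, 3000, true));  // bool(false), expired
```

Because the file is only ever read and served after a check like this passes, changing a setting takes effect on the very next request, without touching the link.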

        Remove file

        When a file is removed it is not deleted; it can still be found in the media library. It is just detached from that asset set. You can delete it via the media library if you wish.

        Stats

        A basic hit count is recorded per file.

        Asset Set

        Each asset set is a custom post type, and the uploaded files are attached to this post. The URL for the asset set is obfuscated to protect its location. If it is linked to from anywhere on the site, though, it will be indexed, because bots can find it while crawling the site.

        You can upload a set of files, then only share the one link. That way if you decide to change the links around you can. Only available files will be listed there. So if a file is “secure” and the user isn’t logged in, they won’t see it, nor will anyone see expired and disabled files.

        Future features I’m working on:

        • Sha1: If you upload a file that already exists it will link that file to your post instead of keeping multiple versions of the file. I believe that WordPress should work this way in general, all filesystems for that matter. That’s a benefit of networks. Why keep doubles, unless you intentionally are backing up the information?
        • File replacement: After uploading and even sharing a file you’ll be able to replace the file behind the active link with a file of the same MIME type. This way if you make a typo you can fix it quickly and replace the file without sending out a new link.

        What do you think?

        If you have ideas or discover bugs, let me know.

          How WordPress Works: Dissecting the Database

          The WordPress Database

          There is beauty in the simplicity of WordPress’ database structure. All the functionality of posts, pages, custom posts, taxonomy, users and core settings are here. In 11 tables.

          For comparison, the almighty Drupal has 72 tables, Joomla has 68.

          All posts, pages and custom posts are saved in the `wp_posts` table. They are differentiated by the `post_type` column. Any additional data you need to save with your post (whatever the post_type is) can be stored in `wp_postmeta`.

          Metas are extremely powerful. You can extend everything in pretty much any way with them.

          Example: Your site manages the courses of an educational institute. So you create the post_types of ‘Course’ and ‘Lecturer’. Now you can save everything about the ‘Course’ and ‘Lecturer’ in the `post_content`, but what if you need to store extra information about each that you’ll need to access easily? For a course you might want to know the dates the course is taking place. If you save that in the `post_content`, as part of the other descriptive content, you will not be able to run queries easily on that information, sort it, or pull it out for widgets. That’s where meta comes in.

          wp_postmeta table

          Each of the meta tables (postmeta, commentmeta and usermeta) has 4 columns: meta_id, post_id (or the equivalent), meta_key, and meta_value. Each post can have whatever extra meta you need, and it can be pulled out with a simple SELECT ... WHERE meta_key = 'X'; query.
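If the four-column layout feels abstract, here is a tiny stand-in for the table using plain PHP arrays, following the course example above. The keys course_start and lecturer_id and the function name are invented for illustration; the loop does in spirit what a SELECT filtered on meta_key and ordered by meta_value would do.

```php
<?php
// Hypothetical rows shaped like wp_postmeta: meta_id, post_id, meta_key, meta_value.
$postmeta = array(
    array('meta_id' => 1, 'post_id' => 10, 'meta_key' => 'course_start', 'meta_value' => '2013-09-01'),
    array('meta_id' => 2, 'post_id' => 11, 'meta_key' => 'course_start', 'meta_value' => '2013-03-15'),
    array('meta_id' => 3, 'post_id' => 10, 'meta_key' => 'lecturer_id',  'meta_value' => '7'),
);

// Collect post_id => meta_value for one key, sorted by value.
function sketch_meta_by_key(array $rows, $key)
{
    $matches = array();
    foreach ($rows as $row) {
        if ($row['meta_key'] === $key) {
            $matches[$row['post_id']] = $row['meta_value'];
        }
    }
    asort($matches); // ISO-formatted dates sort correctly as plain strings
    return $matches;
}

$courses = sketch_meta_by_key($postmeta, 'course_start');
print_r($courses); // post 11 (March) comes before post 10 (September)
```

This is exactly what you cannot do when the dates are buried inside post_content.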

          And that’s pretty much it. All of WordPress’s functionality is there. Comments, users, and posts all have their basic structure in their main table and all can be extended as much as needed through their meta.

          Taxonomy is somewhat more complicated. It requires 3 tables. wp_term_taxonomy stores the types of taxonomies. Categories, Tags, and any other custom taxonomy type you create will be here. The individual terms will be in wp_terms. So if you have 3 categories and 15 tags in your site, each of those will be stored in wp_terms. wp_term_relationships links them all together keeping it all in order. Easy-peasy, right?

          The basic options of the WordPress install are in wp_options. The only table out of order is wp_links, a relic of installs past. Today all the link functionality can easily be implemented as a custom post type. But because WordPress cares about backwards compatibility, the table remains.

          That’s it. Lean and mean.

          One question that comes up about meta is, doesn’t that mean that there are a lot of extra queries hitting the database? This would be true, if not for the caching system of WordPress. So each time you call get_post_meta() you’re not hitting the database. So you’re good.

          So when people say that WordPress is “bloated” I’m not quite sure what they’re talking about.


            How I Optimized My LAMP Server

            I recently switched servers for this site. I moved from Media Temple to Digital Ocean. Think of Digital Ocean as AWS but faster, cheaper, and with great UX. I’ve been meaning to move there for a while, ever since I figured out how to manage my own LAMP stack.

            One benefit of Digital Ocean is their fantastic documentation. So there isn’t much to figure out… But for someone who came from Front-end Development, it’s a bit intimidating to manage your own server. To tell you the truth, I’ve tried this move a few times, but the last time I set up a stack for this site I used SUSE Linux (I don’t know what I was thinking), and the site kept crashing.

            Since then I’ve played with VMware and got comfortable with setting up my own development server, and moved to CentOS.

            The missing link was optimizing Apache.

            I’m a big fan of This Week in Startups and one of their sponsors is New Relic. If they say something is worth trying, I try it.

            After switching to Digital Ocean I set up New Relic on the new site. Even though I had installed W3 Total Cache on my install, New Relic was still giving me error warnings every 10-15 minutes. Frustrating! True, I AM running a WordPress multisite on the lowest tier, but none of the sites are high traffic. I should be able to do that.

            Well, after digging into New Relic’s errors I saw that I was using 100% of my physical memory and 200% of my swap memory. BAD.

            Then I found Jean-Sebastien Morisset’s check_httpd_limits.pl. WOW.

            I updated my httpd.conf with his recommendations and look at the results:

            Physical Memory - New Relic Dashboard

            You can clearly see when the new settings took effect.

            Here’s the site’s load average:

            Load Average - New Relic Dashboard

            Best part is, since these settings took effect, NO MORE ERROR WARNINGS FROM NEW RELIC!!!

            So, if you read this Jean-Sebastien, thanks for your wonderful tool! And New Relic, thank YOU for your excellent monitoring that pushed me to do this!


              Using AJAX in WordPress Development. The Quick-and-Dirty QuickStart Guide

              There are some great posts and a fantastic wiki page explaining how to use AJAX in WordPress. But I haven’t found a quick plug-and-play tutorial. So here goes…

              The problem: A simple form that will give the visitor an input, and when they click “Next” it will send the content of the input to the server, which will send all the $_POST fields back as JSON. Why would you want this? Who knows. But it’s a simple problem to solve that you can adapt to do anything.

              Here’s the Gist

              This is the plug-and-play version my friends. (Extra points if you recognize what ui framework is here… DON’T JUDGE ME IT’S ONLY FOR WIREFRAMING.)

              How to use the code

              • Add include_once('inputtitle_submit_inc.php'); in functions.php. Make sure inputtitle_submit_inc.php is in your template folder.
              • page-ajax_input.php is a template page, so make sure it’s in your template folder. Just create a page in WordPress using the “Input Submition Page” template.
              • inputtitle_submit.js should be in a folder named ‘js’ in your template folder. Otherwise

              wp_enqueue_script( 'inputtitle_submit', get_template_directory_uri() . '/js/inputtitle_submit.js', array( 'jquery' ));

              will fail.

              How it works

              page-ajax_input.php

              This is a simple template file. The important elements here are the input field and the next button. They are hooked in the JS file.

              inputtitle_submit_inc.php

              The server-side magic.

              The first line enqueues the js file and pops some variables in for the AJAX onto the page. They are called in inputtitle_submit_scripts().

              The next two lines enable the AJAX to work. They create the ajax action “ajax-inputtitleSubmit”. If you only have “wp_ajax_ajax-inputtitleSubmit” it will only work for logged in users. If you only have “wp_ajax_nopriv_ajax-inputtitleSubmit” it will only work for logged out users. If you do this, make sure you have serious security in place.

              Those two lines tie the action to myajax_inputtitleSubmit_func(). This is what happens server side. Inside you’ll find some nonce magic for security. The function checks the nonce, then converts the $_POST variables to JSON and sends them back to the browser. Don’t forget the exit();
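In case the embedded code doesn't load for you, here is a rough sketch of the server-side pattern just described. It is NOT the code from the Gist: the sketch_ function names, the 'sketch-submit' action and the 'sketch-ajax-nonce' nonce name are all invented for illustration. The WordPress calls it relies on (add_action, wp_verify_nonce, wp_send_json) are only touched when the file runs inside WordPress, so the helper can also be tried on its own.

```php
<?php
// Pure helper: check the nonce, then echo the posted fields back.
// $verify_nonce is any callable that returns true for a valid nonce.
function sketch_build_response(array $post, $verify_nonce)
{
    $nonce = isset($post['nonce']) ? (string) $post['nonce'] : '';
    if (!call_user_func($verify_nonce, $nonce)) {
        return array('success' => false, 'error' => 'invalid nonce');
    }
    return array('success' => true, 'fields' => $post);
}

// Only register the hooks when WordPress is loaded.
if (function_exists('add_action')) {
    function sketch_ajax_handler()
    {
        $response = sketch_build_response($_POST, function ($nonce) {
            return (bool) wp_verify_nonce($nonce, 'sketch-ajax-nonce');
        });
        wp_send_json($response); // prints the JSON and stops execution
    }
    // wp_ajax_{action} covers logged-in visitors.
    add_action('wp_ajax_sketch-submit', 'sketch_ajax_handler');
    // wp_ajax_nopriv_{action} covers logged-out visitors.
    add_action('wp_ajax_nopriv_sketch-submit', 'sketch_ajax_handler');
}
```

Outside WordPress you can exercise the helper with a stand-in nonce check, for example json_encode(sketch_build_response(array('nonce' => 'abc', 'title' => 'Hello'), function ($n) { return $n === 'abc'; })), which gives back the posted fields wrapped in a success flag.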

              inputtitle_submit.js

              The JavaScript.

              First I encapsulate the jQuery so that it won’t conflict with anything. Then when the DOM is ready…

              When “Next” is clicked it sends a POST AJAX request to the server. The AJAX URL and the nonce were both defined in wp_localize_script in inputtitle_submit_inc.php.

              We send the action, the nonce and the inputted “title” as variables to the server. Then it outputs the response (all $_POST variables as JSON) in the console.

              Summary

              I built this for reference sake. If you can suggest any best practices or improvements please comment below.


                WordPress postmeta is useful, but be careful

                The add_post_meta, delete_post_meta, update_post_meta, and get_post_meta functions are really useful. It’s the perfect place to store information about a post. Many plugins take advantage of this storage for determining whether a specific post/page needs the feature they are providing or not.

                Example: I recently installed the WordPress HTTPS plugin on a site I manage; it allows you to force SSL on a specific page or post of your site.

                After enabling the plugin on a page of the site I checked the “Custom Fields” section (where the postmeta fields are displayed on the post edit page) and lo and behold:

                Screen Shot 2013-02-03 at 5.27.49 PM

                A new postmeta field had been added.

                Not surprising, as I said, it’s a useful place to store information. But there is one aspect of this feature you should be aware of: it is cached on page load.

                When you run the WordPress loop many wonderful things happen to make your page load as efficiently as possible. One of those things is that WordPress caches all the postmeta values when it loads the post.

                This means two things:

                1. You DON’T have to worry about the number of times you call the same value through get_post_meta(), since your server is not making a new query for each function call.

                2. You DO have to worry about how much information you are storing in the postmeta since all that information will be loaded into server memory each time the post or page is loaded. Normal storage will work fine, store things like settings, variables and content that is needed for displaying the post. But don’t think about the postmeta as a place for unlimited storage. Some things do need their own table.

                What do I mean?
                In short, don’t store post logs or large amounts of stats and data there.
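Both points can be shown with a toy model. This is a hypothetical sketch of the idea, not how WordPress actually implements its cache: the class, its methods and the sample meta keys are all made up. One "query" pulls in every meta row for the post, repeat lookups are free, and everything stored against the post sits in memory whether you asked for it or not.

```php
<?php
// Hypothetical stand-in for the postmeta table plus a per-request cache.
final class SketchMetaStore
{
    private $table;           // post_id => array(meta_key => meta_value)
    private $cache = array(); // what has been loaded during this request
    public $queries = 0;      // how many times the "database" was hit

    public function __construct(array $table)
    {
        $this->table = $table;
    }

    // Loads EVERY meta row for the post in one go and keeps it in memory.
    private function prime($post_id)
    {
        if (!isset($this->cache[$post_id])) {
            $this->queries++;
            $this->cache[$post_id] = isset($this->table[$post_id]) ? $this->table[$post_id] : array();
        }
    }

    public function get($post_id, $key)
    {
        $this->prime($post_id);
        return isset($this->cache[$post_id][$key]) ? $this->cache[$post_id][$key] : null;
    }

    // Total size of the cached meta values for one post.
    public function cachedBytes($post_id)
    {
        $this->prime($post_id);
        return array_sum(array_map('strlen', $this->cache[$post_id]));
    }
}

$store = new SketchMetaStore(array(
    42 => array(
        'force_ssl'  => '1',
        'access_log' => str_repeat('x', 500000), // a big log stored as meta
    ),
));

$store->get(42, 'force_ssl');
$store->get(42, 'force_ssl');

echo $store->queries, "\n";         // 1
echo $store->cachedBytes(42), "\n"; // 500001
```

Two lookups of a one-byte setting cost a single query, but they also dragged half a megabyte of log into memory along the way.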

                Example: I made a file uploading plugin for a client to be used internally in their company, which leverages the plupload library built into WordPress. I tied the backend into the company’s LDAP server so that any org member could sign into the uploader and not need an account created. Each file uploaded was tied to the user’s account so that they could each manage their own files. There are a few more useful features thrown in, like file expirations, secured files, and dynamic file serving. I’ll be happy to post specs at some point. It’s pretty cool.

                One feature I added was logging file access, so that when each file is accessed there is a trace of who/what/where/when. I thought: “what better place to store that information than in the postmeta?” Right? NOPE. The site ran smoothly until uploaded images were used in an email blast. The blast only went out to a few thousand people, but each time any of those images was loaded, i.e. each time an email was opened, the ENTIRE access log was loaded into memory.

                Oops.

                'update_post_meta_cache' => false
                

                Was the quick fix (it’s a WP_Query argument that skips loading the postmeta cache), and gave me time to offload the logs and refactor the code…

                For more information about the power of the Loop I highly recommend watching Andrew Nacin’s talk about WP_Query and going through the talk slides.