Getting the gist of segment trees

Segment Trees

Wow they seem overly complex to learn about. They’re a very easy concept, but there is a paucity of resources for explaining them.

Have a gander at the resources at the bottom for the ones I came across. Took me ages to figure out the differences between them.

Here, I assume you know about Binary Trees and recursion on Binary Trees.

Any code is in python which is about as close to pseudo code as you can get.

tl;dr A segmented tree in one diagram

Structuring an array (length 8) in a segmented tree. The indexes of the array are the leaf nodes.

The parent nodes here are sums of the children (but could be another function).

This can be coded as a normal binary tree.

       28
     /   \
   6       22
  / \     / \
 1   5   9   13
/ \ / \ / \ / \
0 1 2 3 4 5 6 7

Step 1: Create a tree with specific indexing strategy

We are going to use an example for storing 2^3 = 8 values indexed by [0,7] from an array in a segment tree.

We’re using 8 to ensure the segment tree we will create is a “perfect” binary tree to make things easier, or at least prettier. All segmented trees are ‘full‘ – all nodes have 0 or 2 children.

The values will get added as leaves of the tree.

So we want a tree that has 8 leaves.

To learn about how we construct them we’ll first number the nodes of a tree in depth wise (across the levels).

Some (Geeks for Geeks) start this at 0. League of Programmers start this at 1.

A perfect binary tree with 8 leaves has 4 levels and 2^4 – 1 = 15 nodes. The eight leaves are the values so we will use the 15 – 8 = 7 nodes for the segment nodes.

A binary tree with 7 nodes indexed by level, starting at 1 and with the 8 values of our array as the leaves:

       1 <= parent
     /   \
   2       3 <= i
  / \     / \
 4   5   6   7 <= children
/ \ / \ / \ / \
0 1 2 3 4 5 6 7

For each node i that we have indexed, e.g node i = 3, then

     parent = i/2 = 1 (integer division)
 left child = 2*i = 6
right child = 2*i+1 = 7

Alternatively, If you index the nodes from 0:

       0 <= parent
     /   \
   1       2 <= i
  / \     / \
 3   4   5   6 <= children
/ \ / \ / \ / \
0 1 2 3 4 5 6 7

Here for i = 2:

parent = (i-1)/2 (integer division)
  left = 2*i + 1
 right = 2*i + 2

The difference between starting from 0 or 1 confused me a lot when I was looking at the various tutorials. I’m going to stick with indexing from 1.

Step 2: Define what the segments are for each of the nodes.

Reminder about ranges:
() = ‘open’ boundary, [] = ‘closed’ boundary. We’re only using closed boundaries here.

x in (a,b] => a < x <= b
x in [a,b) => a <= x < b
N.B. [a,a] = a

The root node i = 1 refers to the whole array range [0,7]. Then node i = 2 refers to the lower half [0,3] and node i = 3 refers to the upper half [4,7]

N.B. [0,7] is the range 0-7 but it can also refer to a python list with 0 and 7 as values. Hopefully the context should make it obvious.

So, so far we have:

       1
     [0,7]
     /   \
   2       3
 [0,3]   [4,7]

Then replacing all indexes of the tree with their corresponding segments:

         [0,7]
        /     \
   [0,3]       [4,7]
   /   \       /   \
[0,1] [2,3]  [4,5] [6,7]
 / \   / \    / \   / \
 0 1   2 3    4 5   6 7

Step 3: Create a function to store the segment of values

We want to set the value of the parent nodes to indicate the range of values that are in the leaves of the current sub-tree. This is quite well explained on Slide 16 of the League of Programmers’ slides:

“Each internal node represents some merging of the leaf nodes”

There are four common ones: sum, min, max, GCD (Greatest Common Divisor).

In more general terms, for node i = 3, node value = f(a[4], a[5], a[6], a[7]) for the array a.

However this function f can be anything, e.g. other possibilities are count or finding the mid point of the range ((max - min) / 2)

Min is used for a the the Range Minimum Query, Geeks for Geeks have a Coded Example.

Example using sum:

       28 (= 6+22)
     /   \
   6       22 (= 9+13)
  / \     / \
 1   5   9   13 (= 6+7)
/ \ / \ / \ / \
0 1 2 3 4 5 6 7

This corresponds to the similar tree in the Geeks for Geeks tutorial.

Step 4: Worked example to ‘fill in the blanks’

We’re going to use count as the function. We want to fill in the ith blank position (starting from 1).

We are given an array of [3,2,1,1] for which blank to fill for the following sorted values [1,2,3,4].

    blanks = [3,2,1,1]
    #         |
    #         ^
    values = [1,2,3,4]

So 1 wants to go into the 3rd blank space. Then 2 wants to go into the 2nd blank space of the remaining 3 spaces.

Step 1: [_,_,1,_] <= 1 goes in 3rd blank space
Step 2: [_,2,1,_] <= 2 goes in 2nd blank space remaining
Step 3: [3,2,1,_] <= 3 goes in 1st blank space remaining
Step 4: [3,2,1,4] <= 4 goes in 1st blank space remaining

       0 <= count of values in segment [1,4]
     /   \
   0       0
  / \     / \
 1   2   3   4
 _   _   _   _

Fill in 1:

       1* <= count increased by 1
     /   \
   0       1*  <= count increased by 1
  / \     / \
 1   2   3   4
 _   _   1*  _ <= 1 placed in 3rd blank

Fill in 2:

       2*
     /   \
   1*      1
  / \     / \
 1   2   3   4
 _   2*  1   _

Fill in 3:

       3*
     /   \
   2*      1
  / \     / \
 1   2   3   4
 3*  2   1   _

Fill in 4:

       4*
     /   \
   2      2*
  / \     / \
 1   2   3   4
 3   2   1   4*

Step 5: Code for ‘fill in the blanks’

We’re assuming that the tree has been created for us. The code for this is at the end.

To programmatically decide which blank to fill, we need to know if the ith value is in the left or right segment. Here’s the starting point again.

Initially, the counts are all zero. So for the root node we want to know if 3 is greater than or less than 4/2 (half the length of the array). If it’s equal, i.e. 2, then we want to put it in the left segment. Also if it’s less than i.e. 1 put it in the left half. If its greater than i.e. [3,4] then it wants to go in the right segment.

    N = len(segment)
    if N/2 < i:
        # go right
    else: # i <= N/2
        # go left
       0 <= looking for 3rd leaf node, i = 3 > N/2 so look right
     /   \
   0       0
  / \     / \
 1   2   3   4
 _   _   _   _

At the second level, the count is still zero. But now we are only considering half the values. so N = 2, but now we want to know if the initial i = 3 goes in the 1st or 2nd segment. We’ve discarded half the values so we can subtract half the original N = 4 values from i. So at the first level of recursion, we alter i for the next level to be, i = i - N/2. Going left requires no change.

The length of the segments, if even, will divide in two, so N = N/2. If odd, then N/2 (using integer division) go left and the rest (N - N/2) go right so we split range [1,N] into [1,N/2] and [N/2+1,N]. So [1,3] => 1 and [2,3]

       0
     /   \
   0       0 <= i = 1, N = 2, i <= N/2 so look left
  / \     / \
 1   2   3   4
 _   _   _   _
    def fillBlank(self, segment, i, N):
        # N = len(segment) # we pass this in
        if i <= N/2:
            # go left
            N = N/2
            self.fillBlank(segment.left, i, N)
        else: # N/2 < i
            # go right
            i = i - N/2
            N = N - N/2
            self.fillBlank(segment.right, i, N)

Then when we hit the leaf then we can set the value.

    # N.B. the new params blanks and value
    # blanks is passed in by reference
    # value is going to fill the blank
    def fillBlank(self, segment, i, N, blanks, value):
        if not segment.left and not segment.right:
            # N == 1
            # we're dealing with array indexes, so minus 1
            blanks[segment.value - 1] = value
            return
        # ...
       0
     /   \
   0       0 <= i = 1, N = 2, i <= N/2 so look left
  / \     / \
 1   2   3   4
 _   _   1   _ <= blanks[2] = 1

Now, to complete our first pass through the tree, we should have updated the counts (the values of each of the nodes) by one as we passed through it. So the full method becomes:

    def fillBlank(self, segment, i, N, blanks, value):
        if not segment.left and not segment.right:
            # N == 1
            # we're dealing with array indexes, so minus 1
            blanks[segment.value - 1] = value
            return
        # increment the count of this segment
        segment.value += 1
        # N = len(segment) # we pass this in
        if i <= N/2:
            # go left
            N = N/2
            self.fillBlank(segment.left, i, N, blanks, value)
        else: # N/2 < i
            # go right
            i = i - N/2
            N = N - N/2
            self.fillBlank(segment.right, i, N, blanks, value)

However this doesn’t quite work for all the following passes through the tree, e.g.searching for the 1st blank space for the very final step – which wants to go into the blank space which is at the end of the array.

To do this we take into account the count of the child segments. If the left segment has been filled up but the right segment is empty, then we automatically want to look in the right segment.

So the condition i <= N/2 becomes i <= N/2 - X where X = segment.left.value and when going right, we alter the ith element similarly i = i - (N/2 - X).

Finally we don’t care about this conditional check when the segment only has two values in it as those two values are just the leaves (which don’t contain counts). This works since all segmented trees are ‘full’ (every node has 0 or 2 children).

    def fillBlank(self, segment, i, N, blanks, value):
        if not segment.left and not segment.right:
            # N == 1
            # we're dealing with array indexes, so minus 1
            blanks[segment.value - 1] = value
            return
        # increment the count of this segment
        segment.value += 1
        # N = len(segment) # we pass this in
        # this section gets more complicated from edge cases
        if N > 3:
            X = segment.left.val
        elif N == 3:
            # left node is leaf
            X = segment.val - segment.right.val
        elif N == 2 and blanks[segment.left.val - 1]:
            # need to check if left has been filled
            X = 1
        else:
            X = 0

        if i <= N/2 - X: # X subtracted
            # go left
            N = N/2
            self.fillBlank(segment.left, i, N, blanks, value)
        else: # N/2 < i
            # go right
            i = i - (N/2 - X)
            N = N - N/2
            self.fillBlank(segment.right, i, N, blanks, value)

Here’s the first step of the second pass:

       1 <= looking for 2nd leaf node, X=0, N=4, i=2 <= N/2 so look left
     /   \
   0       1
  / \     / \
 1   2   3   4
 _   _   1   _

We then follow the recursion through, which should populate the blanks list.

Resources

view raw
segment-trees.md
hosted with ❤ by GitHub

Translating WordPress plugins

If you didn’t write the plugin – how do you translate it? I found lots of information for plugin developers to make their plugins translatable but not for people that just want to translate it.

Executive summary / tl;dr

Edit: The Codestyling Localization plugin I suggested previously isn’t available from WordPress directly any more.

Instead there seems to be an equally good plugin Loco Translate that is complemented by the developers own site for translating any application.

Failing that try the ‘Blank WordPress POT’ with POEdit and put that in the plugin’s languages directory. This seems to be a rock solid approach but takes a bit of learning.

If you put your translations in the plugins language directory, you’ll lose your translations when the plugin gets updated. Either keep backups or, if the plugin loads the translations correctly, you can put your translations in the separate wp-content/languages/[plugin-name] directory.

For worker ants who want to know a bit more

  1. The plugin must be translatable (It has been internationalized)
  2. You can check this by looking through the plugin code
  3. Otherwise just carry on and hope that the developer made it translatable
  4. Then you need to create a translation file that converts the English words into your language (localize it)
  5. A POT (.pot) file is a skeleton file that contains all the words and phrases in the plugin that can be translated
  6. A .po file is a .pot file that has had the translated strings in your language added next to each of the translatable phrases
  7. When you save a .po file it also generates a .mo file (where as you can read a .po file in notepad you can’t read a .mo file)
  8. WordPress plugins rely on .mo files
  9. You first need to find the .pot file
  10. Typically this will be in the root of the plugin directory
  11. Or it will be in a languages directory
  12. If you find it, great. Open it in POEdit, do your stuff and save
  13. If it hasn’t been created at all …
  14. … Then you need to check in the code (look for load_plugin_textdomain in the main plugin file) to see where/if you should create one and create the .pot file yourself

If you don’t know about editing PO files I suggest following the WordPress instructions on using POEdit.

For developers to ‘internationalize’ their plugins

I recommend using the Blank WordPress POT file that I mention below in the POEdit revisited section.

Translation fumbling

POEdit

Initially I tried the POEdit method recommended in WPMU DEV tutorial

However the POT files that I made never worked – they didn’t pick up the translated words. I think it was because of __() and _e() instead of __ and _e.

makepot.php

  • WordPress
    and Tom McFarlin talk about using makepot.php but its confusing if you’re not using Linux
  • It requires using the wp-plugin as the project name which I assumed that you changed to the name of your plugin but you don’t – it has to stay wp-plugin
    so the required command is (for windows):
  • cd path\to\tools\i18n
  • php makepot.php wp-plugin path\to\plugins\your-plugin-name
  • However I got the following error when running it from windows
  • 'msguniq' is not recognized as an internal or external command
  • This appears to be because msguniq is a linux command
  • Actually this had created the your-plugin-name.pot file in the plugin directory but I just didn t realise because of the error
  • If you do want to use makepot.php use grapper’s Github i18n repo which includes all the correct WordPress files rather than the official WordPress subversion directories.

POEdit revisited

Hidden at the end of the WordPress localization document is the excellent ‘Blank WordPress POT’.

That can be put in the plugin’s languages directory and has instructions for creating the PO files from it:

The pot file include in this folder is ready to use.

  1. Double click on it and open it with poedit
  2. In poedit goto File → New Catalog from POT file…
  3. Select and Open the pot file from the languages folder
  4. Enter your name, email address, your language and country (i.e. french fr_FR, german de_DE) to the setting form
  5. Click the update button in the main poedit ui.
  6. Save the file :
  7. For a plugin like filename-xx_XX.po with xx_XX for your language and country.
  8. For a theme xx_XX.po
  9. That’s it, go to your WordPress blog and see your translation in action. N’joy your blogging!!!

This actually worked first time that I tried it – i.e. it pulled in all the translated strings.

First off you can then save the file as a my-plugin-name.pot file in the languages directory. Then you can add in your translations and save it with the locale xx_XX.po file ending and that should generate the .mo file that you need too.

Plugins

Edit: The simplest route was via the Codestyling Localization plugin.

But that is no longer available through the WordPress plugins directly – possibly because it isn’t working or from a security issue. It is still available via the developer’s own site though

Instead you can use Loco Translate which appears to be excellent – but I haven’t tried it.

(These are old instructions for Codestyling Localization) Once it is installed:

  1. Click ‘Add New Language’
  2. Select your locale and it generates an empty .po file
  3. Click ‘Rescan’ to populate the .po file
  4. Click ‘Edit’ and ‘generate mo file’ to get the .mo file out

Dealing with plugin updates

None of this solves the problem that you will lose your translation files once when the plugin is updated.

Copy your .po files into the wp-content/languages/[plugin-name] directory – which might or might not work. If yes, great, if not raise a support query (be nice and give details) with the plugin developer.

This directory is update safe basically because of the translations are loaded with the following priority:

  1. wp-content/languages/[plugin-name] (never changed by updates)
  2. wp-content/languages/plugins/ (changed with updates)
  3. wp-content/plugins/[plugin-name]/languages (changed with updates)

If developers follow this Geert DeDeckere’s excellent guide on loading WordPress language files the right way then there shouldn’t be a problem with using the wp-content/languages directory.

You can also read WooCommerce’s documentation on Making your Localization upgrade safe which recommends either Loco or using /wp-content/languages/woocommerce.

I found these two Support questions useful for learning about handling upgrades: