Sep 14 2012

Sidereal Time Calculator

Category: Mobile | Joel Ivory Johnson @ 04:12
Download Code (1.15 MB)


There are a number of applications available on various mobile devices that allow you to aim your device at the sky and identify various heavenly bodies. I find the ability to identify a body based on a user's location and the device's orientation fascinating. I find it even more fascinating when I see physical hardware acting on this information; the telescope I have uses GPS (to get my location and the current time) and uses this information to automatically move the telescope to the orientation needed to see a selected body.

I now have a pretty good understanding of how it works. A significant part of the calculations involved is based on time and another part is based on coordinate conversion. With the right time conversions you'll have enough information to get the orientation of the stars. (Getting the orientation of the planets and moon requires a little more work, but the stars have no apparent motion with respect to the solar system.) I only want to talk about time for now, as there's more than enough information on time to fill an article.

Terms and Time Units

Time is usually described in terms of some cyclic process or event, where the units of time come from counting the cycles or events. This could be the vibration of a crystal, the passing of some celestial body, or some other event. Historically the apparent motions of the sun and moon have been the periodic events around which our time system was based. We've all used the units of time that are derived from these events: hours, minutes, seconds, months, and years, along with the terms AM, PM, AD, and BC, and degrees. Let's dissect the physical events behind these terms.

Roman Calendar

The Roman calendar is said to have been invented by Romulus (the founder of Rome) around 753 BC. This calendar had 10 months, with the month of the vernal equinox (March) being the first month. The calendar had 304 days plus an additional number of winter days, stretching from the end of December to the start of the following March, that were not part of any month on the calendar.

Julian Calendar

The Julian calendar is a modified version of the Roman calendar. It has 365 days divided into 12 months. Once every 4 years a leap day is added. It sounds very much like the civil calendar that we use today, with the exception that today we don't have leap days in century years (1700, 1800, 1900) unless the year is also divisible by 400. Given enough time, solar and seasonal events would begin to creep to other parts of the Julian calendar. This was corrected with the Gregorian calendar.

Gregorian Calendar, Astronomical Year, and Julian Dates

The Gregorian calendar (also called the Western Calendar or Civil Calendar) is the name of the calendar that most of the world knows, loves, and uses today. The namesake for the calendar is Pope Gregory XIII. The number of solar days in one orbit of the earth around the sun is 365 plus a fraction (approximately 0.2422). The Julian calendar made an attempt to correct for this by introducing a leap year every 4 years. This slowed the rate at which the seasons migrated on the calendar but didn't stop it altogether. Pope Gregory XIII's contribution to the calendar was to not have a leap year in a century year (a year divisible by 100) unless that year is also divisible by 400. The last day of the Julian calendar was Thursday 4 October 1582. The day that came after this was the first day of the Gregorian calendar: Friday 15 October 1582. The date range of 5-14 October 1582 (inclusive) doesn't exist, something that needs to be remembered for time conversions that cross this boundary.
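The two leap-year rules are easy to mix up, so here is a minimal sketch of both. The class and method names are my own for illustration and are not part of the download.

```csharp
using System;

// Illustrative sketch; LeapYearRules is a hypothetical helper, not from the download.
public static class LeapYearRules
{
    // Julian rule: every fourth year is a leap year.
    public static bool IsJulianLeapYear(int year)
    {
        return year % 4 == 0;
    }

    // Gregorian rule: century years are leap years only when divisible by 400.
    public static bool IsGregorianLeapYear(int year)
    {
        if (year % 400 == 0) return true;
        if (year % 100 == 0) return false;
        return year % 4 == 0;
    }
}
```

Under these rules 1900 is a leap year in the Julian calendar but not in the Gregorian calendar, while 2000 is a leap year in both. The framework's own DateTime.IsLeapYear applies the Gregorian rule.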

Julian Date

Another commonly used way of expressing a date is to count the number of days since noon of 1 January 4713 BC. This is called a Julian date (JD). Noon of 1 January 4713 BC is Julian date 0. Midnight between 1 January 4713 BC and 2 January 4713 BC is Julian date 0.5. Note that the time of day is a part of the Julian date as a fractional unit. For more recent dates the number used to express the Julian date is over 2,400,000. To avoid dealing with unnecessarily large numbers there's also the Modified Julian Date (MJD), which counts the number of days since midnight at the start of 17 November 1858 (Julian date 2,400,000.5). Note that MJD starts at midnight while JD starts at noon, so the digits after the decimal point in these two date expressions differ by 0.5. You may also hear of a Julian Day Number, which is just the integer portion of the Julian date. Dates of this form are of special significance to astronomical calculations.
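The relationship between these three forms is simple arithmetic. The sketch below shows it; the class and member names are hypothetical and only here for illustration.

```csharp
using System;

// Illustrative sketch; JulianDateForms is a hypothetical helper, not from the download.
public static class JulianDateForms
{
    // Julian date of the MJD epoch (midnight at the start of 17 November 1858).
    public const double ModifiedJulianDateOffset = 2400000.5;

    // MJD is the Julian date with the epoch offset removed.
    public static double ToModifiedJulianDate(double julianDate)
    {
        return julianDate - ModifiedJulianDateOffset;
    }

    // The Julian Day Number is the integer portion of the Julian date.
    public static double ToJulianDayNumber(double julianDate)
    {
        return Math.Floor(julianDate);
    }
}
```

For example, noon on 1 January 2000 is Julian date 2451545.0, which is MJD 51544.5.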

AD, BC, and Astronomical Year

One of the oddities about the system of tracking years is that there is no year zero. The first year of the calendar, based on the reckoned conception of Jesus Christ of the Christian religion, is 1 AD (AD = Anno Domini, Latin for "in the year of the Lord", also written as "CE" for Common Era). The year immediately preceding this is 1 BC (BC = Before Christ, sometimes written as BCE for Before Common Era). When doing astronomical calculations no one wants to deal with the lack of the zero. So there is also the concept of the Astronomical Year. Astronomical years for the most part align with our current system of tracking years. So 1984 AD is also the Astronomical Year 1984. The difference is apparent when you look at years in the BC range. 1 BC is the astronomical year 0, 2 BC is the astronomical year -1, and so on.
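The BC shift is an easy place for an off-by-one error, so here is a small sketch of the conversion. AstronomicalYear and its methods are hypothetical names used only for this example.

```csharp
using System;

// Illustrative sketch; AstronomicalYear is a hypothetical helper, not from the download.
public static class AstronomicalYear
{
    // 1 BC maps to 0, 2 BC maps to -1, and so on.
    public static int FromBC(int yearBC)
    {
        if (yearBC < 1) throw new ArgumentOutOfRangeException("yearBC");
        return 1 - yearBC;
    }

    // AD years are unchanged.
    public static int FromAD(int yearAD)
    {
        if (yearAD < 1) throw new ArgumentOutOfRangeException("yearAD");
        return yearAD;
    }
}
```

With this mapping the Julian date epoch, 4713 BC, is the astronomical year -4712.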

Solar Day

A solar day is the period over which the sun apparently moves along a path and returns to its starting point. I say apparently because, while we know this phenomenon comes from the earth's rotation, the movement of the sun is still described in geocentric terms (sunrise, sunset, and so on). The sun's path varies slightly from one day to another so it doesn't really return to its starting point. So the meridian is used as the starting line. The meridian is the imaginary circle that runs from the north pole to the south pole and passes directly over the observer. The sun and other bodies reach their highest point in the sky at the meridian and then go from climbing to declining. This line is also used to divide a day in half. Once the sun goes past the meridian the time is labeled as post meridiem (P.M.), Latin for "after midday." Once the sun has passed this line on the opposite side of the globe the time is before midday. The Latin word for before is "ante", so it is referred to as A.M. (for ante meridiem). When the sun is on the meridian it is at its highest point in the sky. This is called "solar noon." Solar noon doesn't necessarily occur at the same instant at which the local time is 12:00 PM. There are slight variations in the time at which the sun reaches this point that we tend to ignore with civil clocks.

The pathway that the sun travels around an observer is usually divided into 24 units. Note that if you divide the 360 degrees of a circle by 24 you get 15. These 24 equal units are called hours (in other words, one hour of rotation is 15 degrees). Each hour may also be divided into 60 units (minutes). A minute of rotation is 15/60, or 0.25 of a degree. As you may have guessed the next level of division is to divide a minute into 60 equal parts (the second), each of which contains 0.25/60 of a degree. An inference that you can make from using hours, minutes, and seconds (HMS) as rotational units is that you can approximate the rotational distance that a celestial body in the night sky will travel over a unit of time; in a 2 hour period an object will travel 30 degrees. I say approximate because if you measure the distance with high precision you'll see that the sun and moon appear to move by a slightly different amount than 30 degrees over this time period. For casual observations this difference won't be noticeable.
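To make the arithmetic concrete, here is a minimal sketch that converts an hours, minutes, seconds value into degrees of rotation. The HourAngle class is a hypothetical name for this example only.

```csharp
using System;

// Illustrative sketch; HourAngle is a hypothetical helper, not from the download.
public static class HourAngle
{
    // 360 degrees divided into 24 hours.
    public const double DegreesPerHour = 360.0 / 24.0;

    // Converts a rotation expressed in hours, minutes, and seconds to degrees.
    public static double ToDegrees(int hours, int minutes, double seconds)
    {
        return hours * DegreesPerHour
             + minutes * (DegreesPerHour / 60.0)
             + seconds * (DegreesPerHour / 3600.0);
    }
}
```

Two hours of rotation works out to 30 degrees and a single minute to 0.25 of a degree, matching the figures above.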

Sidereal Day

If you use the sun as your reference for rotation distance it appears that the earth takes 24 hours to make one full rotation. This isn't quite correct though. The earth moves about 1 degree on its orbit around the sun each day. So the sun shouldn't be used for an accurate measurement of how far the earth has rotated. Any other star will do though. The other stars are far enough away such that their apparent position is the same regardless of where the earth is on its orbit.

Difference in sidereal and solar day.

Choose a star (other than our sun, any star will do). Every time the earth rotates that star will reach the meridian. If you used a wall clock to measure the amount of time it takes for a star to reach the meridian again you'll find it isn't quite 24 hours. It is 23 hours 56 minutes and 4 seconds. Days measured using this method are sidereal days. Because these days are a bit shorter than 24 hours, the number of sidereal days in a year is about 366.25 instead of 365.25. Since a sidereal day is shorter than a solar day, on any given solar day there will exist a range of sidereal times that occur twice within the solar day.
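The 366.25 figure falls out of the two day lengths. A short sketch of that calculation follows; SiderealDay is a hypothetical name, and the day length is the rounded 23h 56m 4s value quoted above.

```csharp
using System;

// Illustrative sketch; SiderealDay is a hypothetical helper, not from the download.
public static class SiderealDay
{
    // 24 hours expressed in seconds.
    public const double SolarDaySeconds = 24 * 3600;

    // 23 hours 56 minutes 4 seconds expressed in seconds.
    public const double SiderealDaySeconds = 23 * 3600 + 56 * 60 + 4;

    // Number of sidereal days that fit in the given number of solar days.
    public static double SiderealDaysIn(double solarDays)
    {
        return solarDays * SolarDaySeconds / SiderealDaySeconds;
    }
}
```

Passing 365.25 solar days gives roughly 366.25 sidereal days: over a year the earth makes one more rotation relative to the stars than it does relative to the sun.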

Time zones

Our universal time system is based on the time at Greenwich. Greenwich is on the zero longitude. Observations of celestial events on its meridian were once the foundation of our timing system. Its meridian is also called the prime meridian. As a matter of convenience we also have the concept of local time, which is derived by adding some number of hours and minutes to the time at Greenwich. The earth is divided into about 40 regions that share local time. These areas, or time zones, usually differ from the time at Greenwich (GMT) by a whole number of hours. There are some time zones that are offset by some number of hours plus 30 minutes. The difference from the most positive offset to the most negative offset is 26 hours. On average the difference between time zones can be inferred from their longitude (recall that one hour is 15 degrees of rotation). However, the time zone lines are not straight. Rather than divide small geographic areas into several time zones, the time zone borders will coincide with the borders of that geographical region.

The Earth's Celestial Movements
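As a rough sketch of the longitude rule, the following estimates the nominal whole-hour offset for a longitude. NominalTimeZone is a hypothetical name; because real time zone borders follow political boundaries, the result is only a first guess and will often differ from the civil offset actually in use.

```csharp
using System;

// Illustrative sketch; NominalTimeZone is a hypothetical helper, not from the download.
public static class NominalTimeZone
{
    // Nearest whole-hour offset from Greenwich for a longitude in degrees
    // (west is negative), using 15 degrees of rotation per hour.
    public static int OffsetHoursFromLongitude(double longitudeDegrees)
    {
        return (int)Math.Round(longitudeDegrees / 15.0, MidpointRounding.AwayFromZero);
    }
}
```

A longitude of 84 degrees west gives -5.6 hours, which rounds to a nominal offset of -6.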

The stars are perceivably in a fixed position. For someone who wants to be extremely technical, the stars are moving at speeds that we would find to be incredibly fast relative to our position or relative to the galaxy in which they rotate. But they are so far away that their movement is imperceptible to us, allowing us to treat them as stationary bodies over short periods measured in hundreds of years. There are a few factors that impact the orientation of the stars with respect to an observer on the earth.

Of these movements the one that has the most immediate impact on an observer is the rotation of the earth. Its impacts are directly observable through the apparent path that the sun, moon, and other bodies travel through our sky. If you are looking at a body with a telescope the movement becomes more apparent unless you have a motorized telescope that automatically adjusts; as you look at a body it will drift out of the view of your telescope within a minute or less. This is the movement with which I am most concerned.

The Earth advances about 1 degree per day as it travels around the sun. With each day that passes, the part of the celestial sphere that is unobservable because of the competing light of the sun shifts slightly. This means that some stars will not be visible during certain parts of the year. While their direction can still be determined, you won't be able to observe these stars during the day (with the exception of during an eclipse). Also note that this impacts the time of sunrise, sunset, and the number of hours in a day in which sunlight is visible (there are fewer hours of daylight in the winter). For now I'm not particularly concerned with what hours a star will be visible during my general case scenarios. Since I only use my telescope when time and weather unexpectedly permit, I don't do much planning ahead. If you've got interest in this I would suggest first exploring the various definitions of twilight (ex: civil vs. nautical vs. astronomical).

The third movement occurs over the course of about 25,700 years. It causes a slight circular drift of the direction in which the earth's rotational axis is pointing. It can be addressed through a time dependent coordinate space adjustment. But I don't want to talk about coordinate conversions in this post. Just in case you are curious, the shift of the direction of the Earth's rotational axis is about 1 degree every 71 years, so we can ignore this movement for now and it won't have a significant impact on our results.

Local Sidereal Time

Because of the continually varying orientation of the earth with respect to the sun we don't want to use a solar day for calculations of where stars are located with respect to the earth. The sidereal time is what we want. To get the sidereal time we need to know the Julian date. We'll get the Julian date from the civil (Gregorian) date. I've made a set of extensions for getting these dates. In calculating the Julian date you will need to be able to calculate how far we are into a day in decimal format. 12:00 noon would be 0.5 into a day, 18:00 is 0.75 into a day, and so on. These can be easily calculated from a date or a time.

static double ToFractionalDay(this TimeSpan sourceTime)
{
    // Fraction of a 24-hour day that has elapsed (12:00 gives 0.5)
    return sourceTime.TotalHours / 24d;
}

static double ToFractionalDay(this DateTime sourceDate)
{
    return sourceDate.TimeOfDay.ToFractionalDay();
}

These are written as extension methods because I find the calling syntax to be cleaner.  Now that we know how far we are into a day we can use that information to calculate the Julian date.

public static double ToJulianDate(this DateTime sourceDate)
{
    double y, m, c;
    // January and February are treated as months 13 and 14 of the previous year
    if (sourceDate.Month <= 2)
    {
        y = sourceDate.Year - 1;
        m = sourceDate.Month + 12;
    }
    else
    {
        y = sourceDate.Year;
        m = sourceDate.Month;
    }

    // Gregorian century correction; GregorianReformDate is declared elsewhere in the class
    double leapDayCount = (sourceDate > GregorianReformDate) ? (2 - Math.Floor(y / 100) + Math.Floor(y / 400)) : 0;
    if (y < 0)
        c = (int)(365.25 * y - 0.75);
    else
        c = (int)(365.25 * y);
    double d = Math.Floor(30.6001 * (m + 1));
    var retVal = leapDayCount + c + d + sourceDate.Day + 1720994.5;
    return retVal + sourceDate.ToFractionalDay();
}

There's something I've not mentioned. All of these calculations are centric to the 0 longitude and are based on the GMT time zone without daylight saving time. If you want to adjust the results to figure out your own orientation with respect to the rest of the observable universe you'll need to make an adjustment for your longitude. If your longitude is to the west of Greenwich express it with a negative number, otherwise use a positive number. Divide this number by 15 to convert it to hours and add it to the sidereal time. I live 84 degrees west of the 0 longitude. So to get the local sidereal time I do the following.

localSiderealTimeClock.CurrentTime = DateTime.Now.ToUniversalTime().ToSiderealTime().Add(TimeSpan.FromHours(-84d/15d));
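The ToSiderealTime extension used above isn't listed in this post. As a minimal sketch of what such a calculation can look like, the code below uses a widely used linear approximation for Greenwich mean sidereal time, GMST = 18.697374558 + 24.06570982441908 × D hours, where D is the number of days since Julian date 2451545.0 (noon on 1 January 2000). SiderealSketch and its method names are my own for this example; this is an assumption about one reasonable approach, not necessarily the method used in the download.

```csharp
using System;

// Illustrative sketch; SiderealSketch is a hypothetical helper, not from the download.
public static class SiderealSketch
{
    // 1 January 2000, 12:00 UT, which corresponds to Julian date 2451545.0.
    static readonly DateTime J2000 = new DateTime(2000, 1, 1, 12, 0, 0, DateTimeKind.Utc);

    // Wraps a value in hours into the range [0, 24).
    static double NormalizeHours(double hours)
    {
        double wrapped = hours % 24.0;
        return wrapped < 0 ? wrapped + 24.0 : wrapped;
    }

    // Approximate Greenwich mean sidereal time in hours.
    // The caller is expected to pass a UTC value.
    public static double GreenwichSiderealHours(DateTime utc)
    {
        double daysSinceJ2000 = (utc - J2000).TotalDays;
        return NormalizeHours(18.697374558 + 24.06570982441908 * daysSinceJ2000);
    }

    // Local sidereal time: add longitude / 15 hours (west is negative).
    public static double LocalSiderealHours(DateTime utc, double longitudeDegrees)
    {
        return NormalizeHours(GreenwichSiderealHours(utc) + longitudeDegrees / 15.0);
    }
}
```

The result is in decimal hours; TimeSpan.FromHours will turn it into something a clock control can display. Because the approximation is linear it is only suitable for dates reasonably close to the year 2000, which is fine for pointing a telescope tonight.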

The local sidereal time describes your rotational displacement from the direction of the vernal equinox (♈). While there's no up in space, the direction formed by drawing a line from the sun to the earth while it is at the vernal equinox is the foundation of a couple of celestial coordinate systems (ecliptic, which is based on the earth's orbit around the sun, and equatorial, which is based on the earth's rotation).

Correcting Variances in the User's Clock

Users may, both intentionally and unintentionally, have their clocks set to an incorrect time. One way of avoiding problems from this is to make use of NTP (Network Time Protocol). I've written on obtaining NTP time before. You can read about it here. While it is possible to continually poll an NTP source for the time, I only grab it once every few minutes. When I get the NTP time the difference between the user's clock and the NTP time source is saved and added to the value that comes from the user's clock. The expectation is that between refreshes of the NTP time the user's clock will reliably count seconds without any significant drift (if it doesn't, then the user needs a new device!).

NtpClient _ntpClient;
TimeSpan _ntpOffset;
DateTime _lastNtpRefresh = DateTime.MinValue;
TimeSpan _ntpRefreshPeriod = TimeSpan.FromMinutes(1);

public MainViewModel()
{
    _ntpClient = new NtpClient();
    _ntpClient.TimeReceived += new EventHandler<NtpClient.TimeReceivedEventArgs>(_ntpClient_TimeReceived);
    //Default the difference to zero and provisionally assume the user's
    //clock is correct until we receive information otherwise
    _ntpOffset = TimeSpan.Zero;
}

void _ntpClient_TimeReceived(object sender, NtpClient.TimeReceivedEventArgs e)
{
    _lastNtpRefresh = DateTime.Now;
    DateTime NtpTime = e.CurrentTime;
    // NTP time is always in universal time, so we need to adjust the system clock
    // to universal before getting the time offset.
    _ntpOffset = NtpTime.Subtract(DateTime.Now.ToUniversalTime());
}

//Use this method to get time adjusted for NTP offset.
DateTime GetDate()
{
    return DateTime.Now.Add(_ntpOffset);
}

Displaying the Time

If you've looked at clocks that show the time in more than one time zone chances are the numbers shown for minutes and seconds were the same for most of the time zones. This isn't the case when looking at both civil time and sidereal time. The seconds will be out of sync. Because of personal preference (I simply find this displeasing) I'm updating the seconds simultaneously. I've made two controls for displaying the time; an analog clock and a digital clock. Both can display the time in 12 hour or 24 hour format.


Digital display of sidereal clock.

Displaying the 24 hour time with an analog clock may be new to many. I took a look at several 24-hour analog clocks in images online. Some started with midnight at the top of the clock and others with midnight at the bottom. I decided on having the midnight (0) hour at the bottom. This places noon at the top of the clock. Displaying 24-hour time in sidereal format is something that I'm still playing with though. While I have a circular gauge-like clock in place I'm going to change this from a user control to a templated control and expose new options on how it's to be rendered. (hints of the forthcoming changes are visible in the source code).

Display of analog clock
Options screen

Help Files

In experimenting with something else I've included a Help HTML file for the application. The help file is stored in the application as content but unpacked the first time the application is run. To prevent the unnecessary unpacking of files every time the application runs, it checks to see if a file already exists before unpacking it.

public class ContentUnpacker
{
    static string[] ContentFileList = { "About.html", "Sidereal.png", "appTimes.png", "settings.png" };

    public static void UnpackAllFiles()
    {
        IsolatedStorageFile sourceArchive = IsolatedStorageFile.GetUserStoreForApplication();
        if (!sourceArchive.DirectoryExists("Content"))
            sourceArchive.CreateDirectory("Content");

        foreach (string s in ContentFileList)
        {
            string targetName = String.Format("Content/{0}", s);
            string sourceName = String.Format("Content/{0}", s);
            // Skip files that were unpacked on an earlier run
            if (!sourceArchive.FileExists(targetName))
            {
                var contentStream = Application.GetResourceStream(new Uri(sourceName, UriKind.Relative));
                using (var outStream = sourceArchive.CreateFile(targetName))
                using (var br = new BinaryReader(contentStream.Stream))
                {
                    var length = (int)br.BaseStream.Length;
                    outStream.Write(br.ReadBytes(length), 0, length);
                }
            }
        }
    }
}

The about page contains only a web browser element that is given the URL to the help files. The entirety of the code that's behind the about page is below.

public partial class AboutPage : PhoneApplicationPage
{
    public AboutPage() { InitializeComponent(); }

    private void PhoneApplicationPage_Loaded(object sender, RoutedEventArgs e)
    {
        aboutBrowser.Navigate(new Uri("Content/About.html", UriKind.Relative));
    }
}

Where to from Here

There are a number of applications I have in mind, some related to astronomy and others not, for which this functionality will be useful. One example of something not related to astronomy is an augmented reality application in which I want to shade the models projected on the screen according to the location of the sun. One of the astronomy related applications comes from having access to a room with projectors and screens on all 4 walls. Just for the fun of it I want to get the computers that control the projectors on all 4 walls communicating with each other and displaying a 360 degree view of the solar system. Getting the sidereal time is a stepping stone for some of these other applications but not the end goal itself. I plan to write on how these other ideas progress as each one of them gets implemented.

Revision History

  • 2012-09-14 - Initial publication






Jan 29 2012

Sparse Array class for .Net

Category: Desktop and Server | Mobile | Joel Ivory Johnson @ 03:09

Download Code(312 Kb)


I had the need for a dynamically growing sparsely populated array. "Sparse" implies that there will be a lot of elements that contain empty values between the ones that contain non-empty values. The .Net collections namespace doesn't contain anything that meets this need. The ArrayList class dynamically grows, but it would have elements allocated for the empty and non-empty values alike. I plan to use this code on devices that have limited memory so this wouldn't do. I ended up making my own class to satisfy this need. While my initial need for this code is for byte arrays I made the code generic so that it can be used with other data types.

How Big is this Class?

Since I will be talking about memory allocation I'll also need to talk about how to get a sense of how much memory an instance of something costs. This won't account for 100% of the memory that is consumed by the allocation of the object. But it will be close enough for you to start making judgements about whether one object costs more or less memory than another.


There are a number of predefined value types for which the size is well known. Here are some of the most common ones.

Type     Size (in bytes)
byte     1
int      4
short    2
float    4
double   8
char     2
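If you'd rather check these sizes than memorize them, C# exposes them through the sizeof operator, which is a compile-time constant for the built-in types. The sketch below is illustrative; PrimitiveSizes is a hypothetical name, and the sum it computes ignores any padding the runtime may add for alignment.

```csharp
using System;

// Illustrative sketch; PrimitiveSizes is a hypothetical helper, not from the download.
public static class PrimitiveSizes
{
    public const int ByteSize = sizeof(byte);     // 1
    public const int ShortSize = sizeof(short);   // 2
    public const int IntSize = sizeof(int);       // 4
    public const int FloatSize = sizeof(float);   // 4
    public const int DoubleSize = sizeof(double); // 8
    public const int CharSize = sizeof(char);     // 2

    // Naive size of a value type holding one int and three doubles:
    // the plain sum of the field sizes, without alignment padding.
    public static int SumOfIntAndThreeDoubles()
    {
        return IntSize + 3 * DoubleSize;
    }
}
```

The naive sum here is 28 bytes. The size actually reserved can be slightly larger when the runtime pads the fields to keep the doubles aligned.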

If you build a struct it is a value type too. The size of a struct will be about the sum of the size of its members. Here is an example.

struct Location
{
    public int LocationID;   // 4 bytes
    public double Latitude;  // 8 bytes
    public double Longitude; // 8 bytes
    public double Elevation; // 8 bytes
}

The size of this structure is about 28 bytes (the runtime may add a few bytes of padding to keep the doubles aligned). Now what happens if you add a reference type (something that is defined as a class instead of a struct)? How big will it be? I'll add a string to the previous example.

struct Location
{
    public int LocationID;      // 4 bytes
    public string LocationName; // ?
    public double Latitude;     // 8 bytes
    public double Longitude;    // 8 bytes
    public double Elevation;    // 8 bytes
}

There are two elements of memory to consider for the reference field. There is the size of the reference and the size of the object to which it is pointing. Think of the LocationName field in the example above as holding a memory address to the area where the string is being held. The size of this memory address will depend on the processor architecture that the code is running against. If the code is JIT compiled for a 32-bit system then the reference will be 32 bits (4 bytes) in size. If it is JIT compiled for a 64-bit system then the reference will be 64 bits (8 bytes) in size. When I am working with just a desktop I do my calculations based on 64-bit systems. But the code in this article will run on Windows Phone so I will be taking both into consideration. There are also other elements within an object's header that take up around 8 bytes. The other element of memory to take into consideration is the size of the string itself. If the string has not yet been assigned and the field is null then the second element of size to consider will have a size of zero. If the string is assigned a value then the second element will be whatever memory is consumed by the string.

It is possible for multiple structs to refer to the same instance of a reference type. When this occurs you'll want to make sure that you don't count the memory taken up by the instance of the reference type multiple times.

Now let's add an array and see what that does for our memory calculations. The array itself is a reference type, so the amount of memory the reference to it will consume is dependent on the processor architecture.

struct Location
{
    public int LocationID;         // 4 bytes
    public string LocationName;    // 4/8 bytes + string memory
    public double Latitude;        // 8 bytes
    public double Longitude;       // 8 bytes
    public double Elevation;       // 8 bytes
    public int[] SomeRandomArray;  // 4/8 bytes + array memory
}

The size of the memory associated with the array will be the sum of the size of the elements that it contains. If the array contains value types (the above contains a value type of int) then the memory allocated when the array is initialized is sizeof(int) (4 bytes) multiplied by the number of elements in the array. If the array contains reference types then the memory consumed by the array will be the size of a reference (4 bytes on a 32-bit system, 8 bytes on a 64-bit system) multiplied by the number of elements that the array can hold. This doesn't include the size of the individual instantiated instances of the elements it contains, each of which also carries an object header of around 8 bytes.

What happens if I change any of the above from a struct to a class? Once a type is made into a reference type it is also going to have an object header. For a struct, when you declare a variable all of the memory that is going to be used by the immediate members is allocated. With a class only the memory for the reference (4 bytes on 32-bit, 8 bytes on 64-bit) is allocated. The memory needed for the members of a reference type is not allocated until there is a call to new. Also note that the memory for instances of a reference type is allocated on the heap, while a value type declared as a local variable is typically allocated on the stack.

A Look at Sparse and Contiguous Memory Allocation

If we take a look at how memory is allocated for a sparse array and a contiguous (normal) array, the reason I needed this will be clearer. As an oversimplified example let's say that an array of 40 elements must be allocated. Regular arrays will allocate the memory contiguously; the bytes associated with the array will be in the same block of memory. Let's say our sparse array is allocating blocks of memory for five elements at a time. In the illustration below the sparse array only holds chunks for the index ranges that contain non-empty data.

Contiguous Array
00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39

Sparse Array (chunks of five elements; -- marks an allocated but empty element)
00 01 02 03 04
05 06 07 08 09
20 21 -- -- --
-- -- -- -- 34
35 36 37 38 39

At a glance the sparse array is taking up less memory than the conventional array. There are still some empty elements within the structure. This is because it's allocating memory for a range of indices. If there's even a single non-empty element within a 5 element range then there's an allocation for all 5 elements. Whether or not this results in less memory being allocated overall is going to depend on the usage patterns for the sparse array. It will work best when the populated elements are clustered closely to each other. Can we get rid of the empty elements altogether? We could by reducing the size of the chunks allocated. The smaller the chunks, the lower the opportunity for empty positions within chunks to exist. If memory was allocated for single elements (a chunk that only holds one element) then we would only have populated elements in the list. But is this better?

It may be. But to get a better answer to this question there's a cost that as of yet has remained invisible for the sparse array. There's a structure for holding  each chunk that looks like the following in my implementation.

public class ArrayChunk<T>
{
    public int StartIndex { get; set; } //4 bytes needed to contain an integer value

    public int EndIndex
    {
        get
        {
            if (Data == null)
                return -1;
            return StartIndex + Data.Length - 1;
        }
    }

    public T[] Data { get; set; }
}

In addition to the array that holds the elements of the chunk itself there is also a field to hold the start index for the array. The start index consumes 4 bytes of memory. So there is an overhead of no less than 4 bytes for each chunk allocated. Consider a conventional and a sparse array, both of which are fully populated with 100 bytes of data. Also assume that the sparse array allocates memory for 10 bytes at a time. The memory consumed by the conventional array will be about 100 bytes. The memory consumed by the sparse array will be around 140 bytes (10 chunks of 14 bytes each). If I reduced the size of the chunks to only hold 2 bytes of data then we end up needing at least 6 bytes for each chunk. For the fully populated 100 element collection this would translate to 50 chunks and no less than 300 bytes. With results like that one may wonder why even bother with the sparse array. But consider another scenario. Let's say that only 20 elements of the array are populated, with 10 elements at the beginning of the array and 10 at the end. For the conventional array the memory consumed will still be at least 100 bytes. For the sparse array it is around 28 bytes. Something that may become apparent is that the best case scenarios for this sparse array will occur when it is partially populated and the populated data items are clustered close to each other.
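To make these numbers easier to experiment with, here is a minimal sketch of chunked storage with a rough size estimate. It is not the class from the download: SparseArraySketch, its dictionary-based chunk lookup, and EstimateBytes are assumptions made for this example, and the estimate ignores object headers and the lookup structure in the same way the figures above do.

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch; SparseArraySketch is hypothetical, not the class from the download.
public class SparseArraySketch<T>
{
    private readonly int _chunkLength;

    // Maps the start index of a chunk to the storage for that chunk.
    private readonly Dictionary<int, T[]> _chunks = new Dictionary<int, T[]>();

    public SparseArraySketch(int chunkLength)
    {
        if (chunkLength < 1) throw new ArgumentOutOfRangeException("chunkLength");
        _chunkLength = chunkLength;
    }

    public int ChunkCount { get { return _chunks.Count; } }

    public T this[int index]
    {
        get
        {
            if (index < 0) throw new ArgumentOutOfRangeException("index");
            int start = index - (index % _chunkLength);
            T[] chunk;
            // Reading an unallocated range returns the default value without allocating.
            return _chunks.TryGetValue(start, out chunk) ? chunk[index - start] : default(T);
        }
        set
        {
            if (index < 0) throw new ArgumentOutOfRangeException("index");
            int start = index - (index % _chunkLength);
            T[] chunk;
            if (!_chunks.TryGetValue(start, out chunk))
            {
                // First write into this range: allocate the whole chunk.
                chunk = new T[_chunkLength];
                _chunks[start] = chunk;
            }
            chunk[index - start] = value;
        }
    }

    // Rough estimate: element storage plus 4 bytes per chunk for its start index.
    public int EstimateBytes(int elementSize)
    {
        return ChunkCount * (_chunkLength * elementSize + sizeof(int));
    }
}
```

Filling all 100 positions of a byte array with a chunk length of 10 gives an estimate of 140 bytes, while populating only the first 10 and last 10 positions allocates two chunks for an estimate of 28 bytes.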


The conventional array cannot grow. Once allocated its size is fixed. There are other collection classes within .Net that can grow, such as List<T> and the classes derived from it, or the MemoryStream class. My understanding is that once the memory buffer for any of these classes is consumed it will allocate a new memory buffer of twice the size of what it had, copy all of its data to the new buffer, and then discard the old buffer to be reclaimed by garbage collection later. In trying to confirm this I found the source code for the MemoryStream class. The code of interest is below.



        private bool EnsureCapacity(int value) {
            // Check for overflow
            if (value < 0)
                throw new IOException("IO.IO_StreamTooLong");
            if (value > _capacity) {
                int newCapacity = value;
                if (newCapacity < 256)
                    newCapacity = 256;
                if (newCapacity < _capacity * 2)
                    newCapacity = _capacity * 2;
                Capacity = newCapacity;
                return true;
            }
            return false;
        }

        // Gets and sets the capacity (number of bytes allocated) for this stream.
        // The capacity cannot be set to a value less than the current length
        // of the stream.
        //| <include file='doc\MemoryStream.uex' path='docs/doc[@for="MemoryStream.Capacity"]/*' />
        public virtual int Capacity {
            get {
                if (!_isOpen) __Error.StreamIsClosed();
                return _capacity - _origin;
            }
            set {
                if (!_isOpen) __Error.StreamIsClosed();
                if (value != _capacity) {
                    if (!_expandable) __Error.MemoryStreamNotExpandable();
                    if (value < _length) throw new ArgumentOutOfRangeException("value", "ArgumentOutOfRange_SmallCapacity");
                    if (value > 0) {
                        byte[] newBuffer = new byte[value];
                        if (_length > 0) Buffer.BlockCopy(_buffer, 0, newBuffer, 0, _length);
                        _buffer = newBuffer;
                    }
                    else {
                        _buffer = null;
                    }
                    _capacity = value;
                }
            }
        }
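One way to watch this growth policy from the outside is to write to a MemoryStream one byte at a time and note every change in its reported capacity. The sketch below is only an illustration: the CapacityGrowthDemo class and RecordCapacities method are names I made up for it, and the exact capacities reported could differ between runtime versions.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class CapacityGrowthDemo
{
    // Writes byteCount bytes one at a time and returns every distinct
    // capacity the stream reported along the way, starting with the
    // capacity of the freshly created stream.
    public static List<int> RecordCapacities(int byteCount)
    {
        var capacities = new List<int>();
        using (var stream = new MemoryStream())
        {
            capacities.Add(stream.Capacity);
            for (int i = 0; i < byteCount; i++)
            {
                stream.WriteByte(0);
                // Only record the capacity when it has changed.
                if (stream.Capacity != capacities[capacities.Count - 1])
                    capacities.Add(stream.Capacity);
            }
        }
        return capacities;
    }
}
```

Printing the result of CapacityGrowthDemo.RecordCapacities(5000) should show the capacity jumping in a handful of steps rather than growing byte by byte. Each step involves allocating a new buffer and copying the old one into it.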

This behaviour is just fine for typical scenarios, but I will be working with what are relatively large buffers (in comparison to the memory available on the devices on which I will be running my code). So I'd prefer to keep a firm ceiling on the maximum amount of memory allocated within the programs that I have in mind. It's also worth mentioning that the .Net runtime treats "large" objects differently than it does small ones. For more information take a look at The Dangers of the Large Object Heap. Large objects (around 85,000 bytes or larger) are allocated on a separate heap from small objects (under 85,000 bytes). During garbage collection the .Net garbage collector will try to compact the objects in the smaller heaps into contiguous memory. Objects in the LOH (Large Object Heap) are not handled as easily. From the referenced article:

Large objects pose a special problem for the runtime: they can’t be reliably moved by copying as they would require twice as much memory for garbage collection. Additionally, moving multi-megabyte objects around would cause the garbage collector to take an unreasonably long time to complete.

.NET solves these problems by simply never moving large objects around. After large objects are removed by the garbage collector, they leave behind holes in the large object heap, thereby causing the free space to become fragmented. When there’s no space at the end of the large object heap for an allocation, .NET searches these holes for a space, and expands the heap if none of the holes are large enough. This can become a problem. As a program allocates and releases blocks from the large object heap, the free blocks between the longer-lived blocks can become smaller. Over time, even a program that does not leak memory, and which never requires more than a fixed amount of memory to perform an operation, can fail with an OutOfMemoryException at the point that the largest free block shrinks to a point where it is too small for the program to use.

There are improvements in the LOH in .Net 4.5. Also note that on devices with more constrained memory (such as Windows Phone) there is no LOH. But there are still advantages to avoiding fragmented memory conditions.

Collection of Chunks

While the code I've written is meant to be a type of collection class, it still depends on a collection class to hold the chunks that it has. It is possible to use one of the List<T> classes or a Dictionary<T1,T2> for this. I've decided to go with the List<T>. Now doesn't it look dubious that I talked about the memory usage patterns of the List<T> class and now I'm using it in my underlying implementation! Isn't that just going to cause the same memory fragmentation problem that I described above? Well, no, at least not as severely. The List<T> contains references to ArrayChunk instances, so each entry will be either 4 bytes or 8 bytes. Let's assume the worst case, which is 8 bytes. To reach the large object boundary the list would need to grow to more than 10,000 elements (85,000 / 8 = 10,625 items needed to make the list's storage large, where 85,000 is the large object size and 8 is the number of bytes needed to store a reference). The number of array elements needed to make the list this size depends on how many elements you allow to be stored in each ArrayChunk. When the list does become large enough to occupy the LOH, the scenario is still better than what would happen with a conventional array, since the block of memory occupying the LOH is smaller than the block that would have been occupied by a contiguous array of the elements.
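To put rough numbers on that, here is a small sketch of the arithmetic. The LohEstimate class and its members are hypothetical names used only for this illustration. The figures ignore the array's own header overhead and assume every chunk up to that point has actually been allocated.

```csharp
using System;

static class LohEstimate
{
    // Approximate large object threshold quoted in the text, in bytes.
    public const int LargeObjectThreshold = 85000;

    // Number of references the list's backing array can hold before the
    // array itself reaches the threshold (array header overhead ignored).
    public static int ReferencesBeforeLoh(int bytesPerReference)
    {
        return LargeObjectThreshold / bytesPerReference;
    }

    // Number of sparse array elements covered by that many chunks
    // for a given chunk size.
    public static long ElementsCovered(int bytesPerReference, int chunkSize)
    {
        return (long)ReferencesBeforeLoh(bytesPerReference) * chunkSize;
    }
}
```

With 8-byte references ReferencesBeforeLoh(8) gives 10,625 chunks, and at a chunk size of 256 that corresponds to ElementsCovered(8, 256) = 2,720,000 elements before the list of chunk references would itself count as a large object.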

For what the sparse array in the code presented with this writeup is capable of doing, the LinkedList<T> would have been suitable (actually more suitable). The List<T> that I use is primarily stepped through forwards and backwards (I wrote the code on the assumption that reads and writes to the list will tend to be clustered close to each other). I used the List<T> because it seems to be a better fit for some modifications I plan to make to this code in the future. I won't go into the details of those plans now since they could change. If those plans do change, I may swap out the List<T> for the LinkedList<T>. When I do make changes I want to ensure that I don't break any of the existing behaviour of the class. The project for this code also contains a few unit tests to validate that the behaviour of the SparseArray<T> doesn't vary from expectations.


There's a small test project included with the code. It will become more important as time goes on because as I make additions to this code I want to ensure that I don't break any of the behaviours that are already present. Right now the tests just check the virtual size of the array, ensure that memory chunks are allocated or deallocated as expected, and confirm that data is preserved.

What is EmptyValue For?

In the sparse array, cells that have not been written to are treated as though they exist and contain some default value. That default value is stored in EmptyValue. There are multiple uses for this member. If there is an attempted read from an unallocated cell, EmptyValue is returned. When the sparse array is searching for chunks to deallocate, it checks the contents of each chunk to see if all of its elements are equal to EmptyValue. When a new SparseArray is instantiated, the EmptyValue field is initialized by calling the default constructor/initializer for the type that it hosts. For the numeric types this will end up being a zero value. If this isn't appropriate for the data type that you are using, there is a delegate named IsEmptyValue to which you can assign a method that returns true to indicate that a value or instance should be considered empty and false otherwise.
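The emptiness check described above could look something like the following sketch. This is not the code from the download: ChunkInspector and IsChunkEmpty are hypothetical names, the chunk is represented as a plain T[], and I am assuming a Func<T, bool> shape for the custom test since the actual signature of IsEmptyValue isn't shown here.

```csharp
using System;
using System.Collections.Generic;

static class ChunkInspector
{
    // Returns true when every element of the chunk counts as empty.
    // isEmpty is an optional caller-supplied test (an assumed signature);
    // when it is null, elements are compared against emptyValue using
    // the default equality comparer for T.
    public static bool IsChunkEmpty<T>(T[] chunk, T emptyValue, Func<T, bool> isEmpty = null)
    {
        var comparer = EqualityComparer<T>.Default;
        foreach (T item in chunk)
        {
            bool empty = isEmpty != null
                ? isEmpty(item)
                : comparer.Equals(item, emptyValue);
            if (!empty)
                return false; // one populated cell keeps the chunk alive
        }
        return true;
    }
}
```

For example, ChunkInspector.IsChunkEmpty(new int[256], 0) is true for a freshly allocated chunk of integers, while passing a test such as v => v < 0 would treat every negative number as empty regardless of EmptyValue.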

Using the Code

Using this code is not much different from using a strongly typed fixed-size array. The main difference is that the upper bound of the SparseArray<T> grows as needed to accommodate writes to positions within the list. Reading from locations beyond the upper bound will not result in an exception. If someone reads from an index that is above the size of the array, the size remains unchanged. But if someone writes to a location above the upper limit, the array will update its reported size and allocate an ArrayChunk if needed.
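To make the read and write rules concrete, here is a stripped-down stand-in that follows the same rules. It is not the SparseArray<T> from the download: MiniSparseArray is a hypothetical class, and it keeps its chunks in a Dictionary keyed by chunk number rather than in a List<T> purely to keep the sketch short.

```csharp
using System;
using System.Collections.Generic;

// Illustrative only; not the SparseArray<T> implementation from the download.
class MiniSparseArray<T>
{
    private readonly Dictionary<int, T[]> _chunks = new Dictionary<int, T[]>();
    private readonly int _chunkSize;

    public MiniSparseArray(int chunkSize = 256) { _chunkSize = chunkSize; }

    public T EmptyValue { get; set; }              // starts out as default(T)
    public int Length { get; private set; }        // virtual size
    public int ChunkCount { get { return _chunks.Count; } }

    public T this[int index]
    {
        get
        {
            if (index < 0) throw new ArgumentOutOfRangeException("index");
            // Reads never allocate a chunk and never change Length.
            T[] chunk;
            return _chunks.TryGetValue(index / _chunkSize, out chunk)
                ? chunk[index % _chunkSize]
                : EmptyValue;
        }
        set
        {
            if (index < 0) throw new ArgumentOutOfRangeException("index");
            T[] chunk;
            if (!_chunks.TryGetValue(index / _chunkSize, out chunk))
            {
                // First write into this region: allocate the chunk and
                // fill it with the empty value.
                chunk = new T[_chunkSize];
                for (int i = 0; i < chunk.Length; i++) chunk[i] = EmptyValue;
                _chunks[index / _chunkSize] = chunk;
            }
            chunk[index % _chunkSize] = value;
            // Writes past the end grow the reported size.
            if (index >= Length) Length = index + 1;
        }
    }
}
```

Writing to positions 15 and 1024 of a new MiniSparseArray<int> allocates two chunks and sets Length to 1025. Reading position 2048 afterwards returns EmptyValue and leaves both numbers untouched.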

Class Interface

Name             Type       Description
ChunkSize        int        Indicates the number of elements that each ArrayChunk can hold
Length           int        Returns the virtual size of the sparse array
ChunkCount       int        Returns the total number of chunks that the sparse array has
this[int index]  (Generic)  Indexer for accessing the contents of the array

Name                            Description
Condense                        Searches through the sparse array and removes the chunks that contain only empty values
ctor()                          Default constructor. Creates a sparse array with a chunk size of 256 elements
ctor(int size)                  Creates a sparse array with a chunk size determined by the size parameter
ctor(int size, int chunkCount)  Creates a sparse array with a chunk size determined by the size parameter. The chunkCount may be used as a hint for preallocating some memory

Usage Example

 //Initialize an instance of the array
SparseArray<int> myArray = new SparseArray<int>();

//Populate the array with a few elements
myArray[15] = 23;
myArray[1024] = 2;

//Print the size of the array. Will be 1025 (remember this is a zero based array)
Console.Out.WriteLine("Virtual size of the array: {0}", myArray.Length);

//Read from a position beyond the limit
Console.Out.WriteLine("Value in position 2048 : {0}", myArray[2048]);

//Print the size of the array again. It is still 1025 because reading beyond the limit does not grow the array
Console.Out.WriteLine("Virtual size of the array: {0}", myArray.Length);

//Show how many chunks the sparse array has. 
Console.Out.WriteLine("Chunk count: {0}", myArray.ChunkCount);

//Set the only populated element in the second chunk to zero and then condense the array
myArray[1024] = 0;
myArray.Condense();

//Check the chunk count again. It should be reduced. 
Console.Out.WriteLine("Chunk count: {0}", myArray.ChunkCount);


Oct 21 2011

Modified Julian Date

Category: Joel Ivory Johnson @ 04:58

Events related to the solar system and astronomy are often measured with modified Julian dates (MJD). Like regular Julian dates, MJDs are a count of the number of days since some specific date. But Julian dates start at noon while MJDs start at midnight. The day from which they start counting is also different.
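The two scales differ by a fixed offset: MJD = JD - 2400000.5, which places MJD 0 at midnight at the start of 17 November 1858. A minimal sketch of the conversion follows (the JulianDateOffsets class and its method names are only illustrative).

```csharp
using System;

static class JulianDateOffsets
{
    // Julian date of the MJD epoch (midnight, 17 November 1858).
    public const double MjdEpochAsJulianDate = 2400000.5;

    // Converts a Julian date to a modified Julian date.
    public static double JulianToModified(double julianDate)
    {
        return julianDate - MjdEpochAsJulianDate;
    }

    // Converts a modified Julian date back to a Julian date.
    public static double ModifiedToJulian(double modifiedJulianDate)
    {
        return modifiedJulianDate + MjdEpochAsJulianDate;
    }
}
```

The half day in the offset is what moves the start of the day from noon to midnight. For instance, noon on 1 January 2000 is Julian date 2451545.0, which JulianToModified turns into MJD 51544.5.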

I was looking for a conversion algorithm and didn't quite like the ones I came across. They were all written for C/C++ and the components of the date and time had to be passed as separate parameters. There are also some features in .Net that make the code more readable and that just were not available in C/C++. So based on several other algorithms I encountered I came up with the following.


double TimespanToDayPortion(TimeSpan source)
{
    return source.TotalHours / 24;
}

double TimeToModifiedJulianDate(DateTime sourceTime)
{
    int calcMonth, calcYear, calcDay;

    calcDay = sourceTime.Day;
    // January and February count as months 13 and 14 of the previous year
    if (sourceTime.Month <= 2)
    {
        calcMonth = sourceTime.Month + 12;
        calcYear = sourceTime.Year - 1;
    }
    else
    {
        calcMonth = sourceTime.Month;
        calcYear = sourceTime.Year;
    }
    // Leap days under the Gregorian rules
    var leapDays = (calcYear / 400) - (calcYear / 100) + (calcYear / 4);
    var mjd = 365L*calcYear - 679004L + leapDays + (int)(30.6001*(calcMonth+1)) + calcDay;
    return mjd + TimespanToDayPortion(sourceTime.TimeOfDay);
}

I've tested this against some other online calculators and the results were consistent.
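Another way to sanity check a result, without going online, is to let DateTime do the calendar arithmetic and count the days since the MJD epoch directly. This is a different technique from the algorithm above, offered only as a cross-check. MjdCrossCheck and FromDateTime are hypothetical names, and the sketch assumes the DateTime passed in is already expressed in UT.

```csharp
using System;

static class MjdCrossCheck
{
    // MJD 0 corresponds to midnight at the start of 17 November 1858.
    private static readonly DateTime MjdEpoch = new DateTime(1858, 11, 17);

    // Days elapsed since the MJD epoch, using DateTime's own
    // (Gregorian) calendar arithmetic. Time zones are not considered.
    public static double FromDateTime(DateTime value)
    {
        return (value - MjdEpoch).TotalDays;
    }
}
```

Midnight on 1 January 2000 should come out as 51544, and noon on 1 February 2000 as 51575.5. Dates in January and February are worth including in any comparison since they take the month and year adjustment branch of the algorithm.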

For the sake of simplicity I didn't account for the Gregorian calendar reform. As a part of the reform, 4 October 1582 was followed by 15 October 1582. I only need this algorithm to do calculations for the current century, so not accounting for that has no impact on my needs.

I'll use this algorithm later in one of the augmented reality example programs and will refer back to it then.