URL Field encoding in 2.6.5?

Hi, I've submitted this again as it may be hidden in the other thread, and it is very important to me in deciding whether to keep trying to use WDS 2.6.5 as part of my product. I need to make that decision pretty soon to ship on time. Thanks for any help - even if it's just a 'We are trying to find out, hang on a week...'

From my app I do a search on Outlook emails via the COM API.

In the resultset I get back I need to be able to display the item in Outlook using the StoreId and EntryId values. I can get the Store Id from the first part of the URL field. The EntryId used to be readable, but now, for storage size savings I imagine, it's been encoded.

What is the special encoding of the URL column now?

I wasn't too worried before, because in 2.5 the EntryId was also included in the 'Filename' column, but by the looks of it, in 2.6.5 that has now been blanked out to '...'! See the thread here: http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=293242&SiteID=1

I checked that the end part of the URL is not a straight Unicode conversion, so what do I use? I don't mind if this is even a temporary solution, so any clues/hints would be greatly appreciated.

Any help with this is really appreciated - I am trying to promote the use of the latest WDS after all :-)




  • SquadMP

    Thank you Tom - that's just what I needed. I appreciate the info.

    Bill - thanks for following this up.

    - David



  • Mervyn

    Hi David,

    Let me look into this for you. Given your time considerations I'll do my best in getting a quick answer.

    Thanks,

    Bill Connors

    Program Manager, Windows Desktop Search - Communities



  • Annick

    The end of the URL is the EntryId, but to save space we now encode it.

    The decoding algorithm looks like this:

    void CEntryId::FromString(__in CString &strEIDU)
    {
        //----------- Decode with YEncode-like algorithm -------
        // Decode using pseudo-YENC encoding: everything is shifted up by 0x30 so that it starts at the '0' character.
        // Values that wrap are written as an escape '!' followed by the value shifted by 0x60, which puts it back in the '0' character range.
        LPCWSTR pszSrc = strEIDU;

        // the first WCHAR is the length of the buffer
        m_cBytes = *pszSrc++ - 0x30;

        // add 2 to the size of the buffer since we assume an odd buffer size could lead to a byte being assigned after
        // the size of the buffer. This is a side effect of the way we are encoding the buffer
        AllocateBuffer(m_cBytes+2);

        ATLASSERT(m_pEntryId != NULL);

        if (m_pEntryId != NULL)
        {
            // output is to allocated buffer
            LPWSTR pszDest = (LPWSTR)m_pEntryId;

            // for each WCHAR in the Unicode string,
            int Len = strEIDU.GetLength();

            // (start at position 1 because the zeroth WCHAR is actually the # of bytes in EID)
            for(int i=1; i < Len; i++)
            {
                int offset = 0x30;

                // if it's an escaped char
                if (*pszSrc == L'!')
                {
                    // skip escape char
                    pszSrc++;
                    i++;

                    // offset is now 0x60
                    offset = 0x60;
                }
                // restore current WCHAR
                *pszDest++ = (WCHAR)(((ULONG)*pszSrc + 0x10000) % 0x10000) - offset;

                // advance to next position
                pszSrc++;

                ATLASSERT((DWORD)((LPBYTE)pszDest - (LPBYTE)m_pEntryId) <= (DWORD)(m_cBytes+2));
            }
        }
    }
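
For anyone calling this from outside the WDS code base, here is a minimal standalone sketch of the same decoding steps using only the standard library. The function name `DecodeEntryId` and the use of `std::u16string` for the encoded tail of the URL are assumptions made for illustration, not part of WDS. It assumes the EntryId bytes are packed into 16-bit units in little-endian order, as they are on Windows, and unlike the routine above it drops the padding byte of an odd-length EntryId and gives up quietly on malformed input.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not a WDS API): decodes the encoded EntryId taken from
// the end of the URL field into raw bytes.
std::vector<std::uint8_t> DecodeEntryId(const std::u16string& encoded)
{
    std::vector<std::uint8_t> bytes;

    // The first code unit is the byte count shifted up by 0x30.
    if (encoded.empty() || encoded[0] < 0x30)
        return bytes;  // malformed input: nothing to decode
    const std::size_t byteCount = static_cast<std::size_t>(encoded[0] - 0x30);

    bytes.reserve(byteCount + 2);
    for (std::size_t i = 1; i < encoded.size(); ++i)
    {
        std::uint16_t offset = 0x30;

        // An escape '!' means the next code unit was shifted by 0x60 instead.
        if (encoded[i] == u'!')
        {
            if (++i >= encoded.size())
                break;  // dangling escape at the end of the string
            offset = 0x60;
        }

        // Undo the shift; the cast wraps the result modulo 0x10000.
        const std::uint16_t unit = static_cast<std::uint16_t>(encoded[i] - offset);

        // Each 16-bit unit carries two EntryId bytes, low byte first.
        bytes.push_back(static_cast<std::uint8_t>(unit & 0xFF));
        bytes.push_back(static_cast<std::uint8_t>(unit >> 8));
    }

    // An odd byte count leaves one padding byte at the end; drop it.
    if (bytes.size() > byteCount)
        bytes.resize(byteCount);
    return bytes;
}
```

As a hand-made check (not real WDS data): the code units 0x0033, 0x0231, 0x0033 decode to the three bytes 01 02 03, and 0x0032, '!', 0x0050 decode to F0 FF. The resulting bytes are what you would then pass on as the binary EntryId when opening the item in Outlook.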


