Designing app-centric sharing for SkyDrive, part 2 of 2: Rebuilding permissions

One of the biggest changes we made in the recent SkyDrive release was how we deal with permissions on files and folders. Making these underlying changes to our service without impacting customers is a bit like replacing the engines on an airplane while it’s flying. The technical challenges were tremendous, but the end result is a system that allows far more flexibility in how you share your files and photos. This post was authored by David Nichols, Software Development Lead for our Storage system, and discusses the technical challenges in making app-centric sharing possible.

-Omar Shahine, Group Program Manager, SkyDrive.com

Our latest releases of SkyDrive include a major revision to our sharing system that lets you give other people permission to see, or even edit, your documents and photos. These releases involved a lot of work in both our front-end web system, which implements the user interface to SkyDrive.com, and our back-end file system, designed to provide persistent storage for your documents and photos. You can also see this capability in SkyDrive for Windows Phone and iPhone in the form of “view-only” and “view and edit” link sharing. Along the way we had several design challenges, and in this post we’ll look at three of them: sharing your data with people who don’t use Windows Live, sharing your data from anywhere in your file tree, and finding the files that people have shared with you.

Share your data with anyone

Social networks were still new when we first designed SkyDrive. Facebook wasn’t available outside of universities; MySpace was in its heyday; the idea of integration between networks was a long way off. We expected the sharing patterns to be either sharing with a specific list of contacts in Windows Live or with Messenger buddies. In particular, it was awkward to share with someone who doesn’t have a Windows Live account. The solution to this problem lies in the way we represent sharing permission for files and folders.

Every file or folder in SkyDrive has an optional “access control list” that shows who’s allowed to read or edit the file or folder. You can apply permissions at the folder level (which means that everything inside the folder has the same set of permissions), or you can apply different permissions to individual items inside the folder. This is similar to how enterprise systems (such as Microsoft Windows) track permission information, but our system has a twist.

In addition to being able to hold entries such as “user x” or “buddies of user y,” our system can also hold “token-based” access items. A token is just a string of random (and thus unguessable) bits. If you know the bits, you can gain whatever access the token gives you. We embed these tokens in URLs and send them out in the invitation email when you share a file. When the recipient clicks the link in the invitation, they either get direct access to the file, or get the option to add their Windows Live ID to the access list for the file.
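
The idea of mixing user entries and token entries in one access list can be sketched as follows. This is a minimal illustration, not the actual SkyDrive implementation: the class names, the `user:` and `token:` prefixes, the token length, and the `skydrive.example` URL are all assumptions made for the example.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class AccessEntry:
    principal: str      # hypothetical encoding: "user:<live id>" or "token:<random string>"
    access: str         # "read" or "edit"
    comment: str = ""   # who the entry was originally intended for


@dataclass
class AccessList:
    entries: list = field(default_factory=list)

    def add_user(self, live_id, access):
        self.entries.append(AccessEntry(f"user:{live_id}", access))

    def add_token(self, access, intended_for):
        # 32 random bytes from the OS, encoded as URL-safe text (length is an assumption)
        token = secrets.token_urlsafe(32)
        self.entries.append(AccessEntry(f"token:{token}", access, intended_for))
        return token

    def access_for_token(self, token):
        # Knowing the token is what grants the access; an unknown token grants nothing
        for entry in self.entries:
            if entry.principal == f"token:{token}":
                return entry.access
        return None


def share_url(item_id, token):
    # Hypothetical URL shape; the real invitation links are not described in the post
    return f"https://skydrive.example/view?id={item_id}&token={token}"


acl = AccessList()
acl.add_user("carol@hotmail-example.com", "read")
bob_token = acl.add_token("read", intended_for="bob@contoso.com")
print(share_url("fried-okra", bob_token))
```

The token comes from `secrets` rather than `random` because it has to be unguessable, which is the property the access model depends on.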

Here’s an example of how this works

Let’s say that Alice wants to share her famous fried okra recipe with Bob, Carol, and David. She knows their email addresses but only has a Windows Live ID for Carol, who is one of her Messenger buddies. Alice uses the Share dialog on the file “Fried Okra.docx” and enters the email addresses for Bob, Carol, and David. After sending the invitation, the access list for “Fried Okra.docx” looks something like this:

Who                                            | Access | Comment
Token 23 (the real ones are longer)            | Read   | bob@contoso.com
carol@hotmail-example.com (a Windows Live ID)  | Read   |
Token 51                                       | Read   | david@contoso.com

Bob gets an email with the token URL, and simply uses it to read the document. As long as he saves the email, he can continue to use that URL (unless Alice changes her mind, see below). Carol uses the URL and logs in with her Windows Live ID. By doing so, not only can she see the document, but it shows up on her “Shared With Me” list whenever she uses SkyDrive. David has a Windows Live ID that Alice didn’t know about, so when he uses the URL, he’s able to substitute his actual Windows Live ID for the token and also see the okra recipe in his “Shared With Me” list. At this point, the access looks like this:

Who                                            | Access | Comment
Token 23 (the real ones are longer)            | Read   | bob@contoso.com
carol@hotmail-example.com (a Windows Live ID)  | Read   |
david@live-example.com                         | Read   | david@contoso.com

Why the comments? Their purpose is to help with revocation. Say Alice has a change of heart about sharing and wants to remove access from Bob and Carol. When she goes to edit access for the document, she needs to see something more informative than “Token 23.” Because the system remembered the original recipients the tokens were intended for, Alice can choose the correct items to remove from the access list. Once the token has been revoked, the URL in Bob’s saved email will stop working.
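
The okra example, including David's substitution and a revocation driven by the comment column, might look roughly like this in code. It is a sketch under assumptions: the dictionary layout and the function names `redeem_token`, `revoke_by_comment`, and `can_read` are invented for illustration.

```python
def redeem_token(acl, token, live_id):
    """Swap a token entry for the recipient's Windows Live ID, keeping the comment."""
    for entry in acl:
        if entry["who"] == f"token:{token}":
            entry["who"] = f"user:{live_id}"
            return True
    return False


def revoke_by_comment(acl, intended_recipient):
    """Remove every entry whose comment records the given original recipient."""
    kept = [e for e in acl if e["comment"] != intended_recipient]
    removed = len(acl) - len(kept)
    acl[:] = kept  # modify the caller's list in place
    return removed


def can_read(acl, who):
    return any(e["who"] == who for e in acl)


# Access list right after Alice sends the invitations
acl = [
    {"who": "token:23", "access": "read", "comment": "bob@contoso.com"},
    {"who": "user:carol@hotmail-example.com", "access": "read", "comment": ""},
    {"who": "token:51", "access": "read", "comment": "david@contoso.com"},
]

# David signs in and substitutes his Windows Live ID for token 51
redeem_token(acl, "51", "david@live-example.com")

# Alice later revokes Bob's access; she picks the entry by its comment
revoke_by_comment(acl, "bob@contoso.com")

print(can_read(acl, "token:23"))                     # False: Bob's saved URL stops working
print(can_read(acl, "user:david@live-example.com"))  # True
```

Keeping the comment through the substitution is what lets Alice still recognize David's entry by the email address she originally typed.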

Share your files without moving them

The old sharing system for SkyDrive was optimized for the way we expected people to use the system at the time. SkyDrive was used mostly for sharing photos, so we wanted to make it as simple as possible to share an album at a time. We understood that tracking what was shared and what wasn’t could get complex, so we limited the possible “sharable things” to top-level albums in someone’s SkyDrive.

As we added support for storing, editing and finding Office documents, we realized that this simple sharing model wouldn’t capture the sharing patterns our users needed. As Tony East mentioned in his post Designing app-centric sharing for SkyDrive, part 1 of 2: Complexity of “simple,” the ability to share shouldn’t depend on file organization. You should be able to point to any file, anywhere, and share it without moving it.

The problem with this lay in an early decision to store file access information in a different service than the SkyDrive backend. Until this release, the access lists for folders were stored in our contacts and relationships system, ABCH. While this made sense in light of the scenarios at the time, the new sharing model was going to cause scaling issues, because every shared file in SkyDrive would require data in ABCH.

To get the access lists back in SkyDrive, we needed a data migration. Data migrations are quite complicated in large scale on-line systems, because the user data is partitioned across many servers in our data centers. Both SkyDrive and ABCH partition the users across servers, but we use different patterns to do so. So while Alice and Bob’s data might be on the same server in SkyDrive, their data is likely on different servers in ABCH.

We know how to do this: start up a set of migration tasks in our job system, have them examine each user individually, and then move that user’s data. Because we’re moving data from one system to another, this can take as long as a few months to complete. To avoid waiting that long, we first ran a local-to-SkyDrive pass that tweaked our internal data format to support on-demand migration. As soon as this pass was done, we were ready to support the new features. If a user edits sharing on an existing folder, we bring the data for that folder over right away. In the meantime, our migration job is moving all the data, whether it’s changed or not.
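
The combination of on-demand migration and a slow background job can be sketched like this. Everything here is a simplified assumption: the two dictionaries stand in for the old external store and the new SkyDrive-side storage, and the class and method names are hypothetical.

```python
class AccessListStore:
    def __init__(self, legacy):
        self.legacy = legacy  # stand-in for the old store: {folder_id: [entries]}
        self.local = {}       # stand-in for the new SkyDrive-side storage

    def migrate_folder(self, folder_id):
        # Copy the folder's access list over once; later calls are no-ops
        if folder_id not in self.local:
            self.local[folder_id] = list(self.legacy.get(folder_id, []))
        return self.local[folder_id]

    def edit_sharing(self, folder_id, new_entry):
        # A user edit pulls the folder's data over right away, then applies the change
        acl = self.migrate_folder(folder_id)
        acl.append(new_entry)

    def read_acl(self, folder_id):
        # Prefer migrated data; fall back to the legacy store until the job gets there
        if folder_id in self.local:
            return self.local[folder_id]
        return self.legacy.get(folder_id, [])

    def background_pass(self, batch_size):
        # The long-running job moves everything, whether it has changed or not
        pending = [f for f in self.legacy if f not in self.local][:batch_size]
        for folder_id in pending:
            self.migrate_folder(folder_id)
        return len(pending)


store = AccessListStore({"photos": ["carol"], "recipes": ["bob"], "taxes": []})
store.edit_sharing("recipes", "david")   # migrated on demand
print(store.read_acl("recipes"))         # ['bob', 'david']
print(store.background_pass(10))         # 2 folders left for the job
```

The point of the design is that new features do not have to wait for the background job: any folder a user touches is already in the new format by the time the edit is applied.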

Find what’s shared with you

Another feature of our sharing system that’s different from conventional file systems is the “Shared With Me” list. While you could save every invitation email that tells you about files your friends have shared, we’ve found that it’s much better if the system manages this list for you. Because we partition our file data on servers by the user who owns the data, this isn’t trivial to do. If ten people share files with Alice, the access lists for those files are on ten different servers out of hundreds in our system, so there’s no one good place to go to for the list. To solve this problem, our implementation builds on our full-text indexing system, so let’s take a look at that.

Full-text systems work by taking documents in the system and finding all the words in each. From this, they create “inverted indices,” which have words and the corresponding list of documents that contain those words. For example, there might be an entry like “okra: 1,7,107,243,512,514,…” and another, “recipe: 3,56,107,201,512,703,…” which means that the word “okra” appears in the first, seventh, 107th, 243rd, etc. documents, and that “recipe” appears in the third, 56th, 107th, 201st, etc. documents. To find all documents with “okra” and “recipe”, we take the intersection of the two lists (which is easy, since they’re in order), and discover that the 107th and 512th documents contain both words.
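
Here is a small sketch of an inverted index and the ordered-list intersection described above. The function names are made up for the example, and the two lists are truncated to the document numbers quoted in the text.

```python
from collections import defaultdict


def build_index(docs):
    """Map each word to the sorted list of document ids that contain it."""
    index = defaultdict(list)
    for doc_id in sorted(docs):  # visiting ids in order keeps every list sorted
        for word in set(docs[doc_id].lower().split()):
            index[word].append(doc_id)
    return index


def intersect(a, b):
    """Intersect two sorted id lists with a single two-pointer pass."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


okra = [1, 7, 107, 243, 512, 514]
recipe = [3, 56, 107, 201, 512, 703]
print(intersect(okra, recipe))  # [107, 512]
```

Because both lists are already in order, the intersection never has to look at an entry twice, which is why the text calls it easy.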

SkyDrive Full-Text Index

For SkyDrive, we have a full-text index of all documents in the system. However, we can’t let people see all the documents in a search result, only the ones they are allowed to view. To do this, we index the Windows Live IDs of the allowed viewers onto the documents as well. In addition to the word entries above, we add special strings to the documents that get indexed much like the words do, but which encode the permission data. For example, the string “VIEWER=carol@hotmail-example.com” would mean that Carol has view permission for a specific document. Then the inverted index gets an entry like “VIEWER=carol@hotmail-example.com: 39, 107, 762, ...” When Carol searches for “okra recipe,” we change the query to “okra recipe VIEWER=carol@hotmail-example.com.” So Carol gets document 107 back, but not document 512, which she isn’t allowed to read.
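
The query rewrite can be illustrated with a toy index that holds the entries quoted in this post. The `search` function and its dictionary-based index are assumptions for the sketch, not the real query engine.

```python
def search(index, words, viewer):
    """Return ids of documents that contain every word and that the viewer may read."""
    # The permission check is just one more term appended to the query
    terms = [w.lower() for w in words] + [f"VIEWER={viewer}"]
    result = None
    for term in terms:
        postings = set(index.get(term, []))
        result = postings if result is None else result & postings
    return sorted(result)


# Toy index using the example entries from the text
index = {
    "okra": [1, 7, 107, 243, 512, 514],
    "recipe": [3, 56, 107, 201, 512, 703],
    "VIEWER=carol@hotmail-example.com": [39, 107, 762],
}

print(search(index, ["okra", "recipe"], "carol@hotmail-example.com"))  # [107]
```

A viewer with no `VIEWER=` entry at all gets an empty result, since the missing term intersects everything down to nothing.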

With this index, an obvious way to implement “Shared With Me” is to search for the documents Carol is allowed to read. This isn’t exactly right, but it’s close. First, we want to exclude documents that she owns, because we’re showing them elsewhere. Second, we need to include photos, which normally aren’t in the full-text index. Finally, we don’t really want all the files Carol has access to, but only the files or folders where someone explicitly added Carol. If Alice shares a folder with 100 documents, we want only the folder to show up in Shared With Me, not all 100 of the contained documents. If she shares a single spreadsheet, we want to show it too.

The answer to these problems is to index all the shared files or folders with a second index field which tracks exactly the documents and folders that got explicitly shared. This field is only on the shared items, not on files contained within folders, and doesn’t include the document owner. Our search is then for “SHARED-WITH=carol@hotmail-example.com,” which gives us exactly what we want.
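
The difference between the two index fields can be sketched as follows. The function names, the item names, and the choice to give the owner a `VIEWER=` term are assumptions for illustration; only the `VIEWER=` and `SHARED-WITH=` strings come from the description above.

```python
def index_terms(owner, explicit_grantees, inherited_grantees=()):
    """Build the permission terms indexed for one file or folder."""
    viewers = {owner} | set(explicit_grantees) | set(inherited_grantees)
    terms = {f"VIEWER={v}" for v in viewers}
    # SHARED-WITH only covers people added directly to this item, never the owner
    terms |= {f"SHARED-WITH={g}" for g in explicit_grantees if g != owner}
    return terms


def shared_with_me(items, user):
    needle = f"SHARED-WITH={user}"
    return sorted(name for name, terms in items.items() if needle in terms)


alice = "alice@hotmail-example.com"   # hypothetical address for Alice
carol = "carol@hotmail-example.com"

items = {
    # Alice shares the folder with Carol explicitly
    "Recipes": index_terms(alice, [carol]),
    # The file inside only inherits Carol's access from the folder
    "Recipes/Fried Okra.docx": index_terms(alice, [], [carol]),
    # A single spreadsheet shared on its own
    "Budget.xlsx": index_terms(alice, [carol]),
    # Carol's own file, which she can view but which is not shared with her
    "Notes.docx": index_terms(carol, []),
}

print(shared_with_me(items, carol))  # ['Budget.xlsx', 'Recipes']
```

The contained file is still findable through its `VIEWER=` term when Carol searches, but it stays out of her Shared With Me list.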

Moving forward

Our changes in the system are a big step forward in our ability to support our sharing scenarios, but we know we aren’t done yet. As we collect feedback from you, we’ll continue to evolve how the sharing system works. With this work, we think we’re in a good spot to move forward rapidly.

David Nichols

Software Development Lead, SkyDrive.com

9 Comments
  • danielgr
    73 Posts

    Knowing that no-one will ever read my previous detailed post, I'll summarize it:

    Context : I've been using skydrive forever & for everything.

    What I love of the new release:

    - File handling, HTML5, Captions and slideshow are back !

    The many things you broke and I miss an awful lot:

    - Social Messenger (the mother of all things bad you did)

    - Permissions.

    - Share with more than 20 persons  without creating a "free access link".

    - File modification criteria inconsistent with 30 years of personal computing.

    - "thumbnail view" sorting criteria can't be modified, and fixed to "modification date" instead of "taken date" for photos.

    - Root folder for "Documents" (.^Documents), but none for "Photos"

    Extremely disappointed to see that you've basically killed all the social capabilities of Skydrive, which has now become a "storage space" to fancily host "e-mail attachments" (be they photos or docs).

  • danielgr
    73 Posts

    Context : I've been using skydrive since it was born before "the social network era", through all and every single one of its versions. I have also got my family little by little into it so that we can mainly share our digital experiences.

    What I love of the new release:

    - The new file handling interface is what many have been waiting for years: light, fast, and flexible (renaming, moving, copying, etc.). Would you add the possibility of drag & drop between folders and move/copy to different accounts and it'd be nearly perfect.

    - The whole HTML5 glory, mp4 streaming video support is amazing.

    - Flexibility to share folders and files around individually is nice, integrated view with groups, recent files, etc.

    - Captions and slideshow (though still limited to browser) are back !

    The many things you broke and I miss:

    - Permissions. Not being able to share without sending an e-mail is a pain. There are simply millions of times when one wants to share something with a limited group of individuals but not necessarily send them an e-mail. I understand this goes like this simply because you've turned off Social Messenger capabilities.

    - Social Messenger. My greatest loss. Not only because nothing I do in skydrive shows in Messenger/Windows Live anymore, but also in my Windows Phone, etc. Now my family keeps adding albums/photos which are shared with me, but I simply never notice. Before my live tile/messenger would simply "update", which was actually one of the main advantages of Windows Phone/Windows Live. With the latest release Windows Live as a social network is dead, and for people that had made the effort of consolidating their social activity into it it's hard to swallow.

    - Not being able to share with more than 20 individual persons (at least that's what the interface tells me when sharing by e-mail) without creating a "free access link".

    - File modification criteria : Moving files from one place another, or renaming them, results in a "file modification", which changes the order items are displayed in most views (specially the non-configurable thumbnail-view). This is annoying and inconsistent with any OS I could imagine. Files should be registered as "changed" only when the file/folder are modified. Changing the name or location are not "modifications".

    - Not being able to sort my items by fixed criteria (date, name, etc.) when in "thumbnail view".

    - Photos automatically sort by "modification date" instead of by "taken date". That is specially annoying when combined with the web update dialog and "modification criteria", because they basically get sorted in the order they were uploaded, which becomes random since 4 simultaneous uploads are done. As a result, I now have to upload my photos one by one if I want to get a consistent order without the need of one-by-one custom re-ordering later. Again, "uploading a file" should not count it as "modified file". The file is not modified, it's simply "new in skydrive".

    - There is a root folder for "Documents" (.^Documents), but none for "Photos". Again, this is inconsistent and misleading. When you first updated to the last version of skydrive I had all my album photos mixed with my Documents in the root, made it difficult to find the later (for example with your new skydrive WP7 app, which lacks "filters"). Ended up creating a "Photos" folder myself ...

    All this is specially annoying because after years waiting skydrive was finally becoming a nice tool for sharing my digital life. The amazing integration showed in WP7 was promising, made me buy it for both myself and wife, recommend it to family and friends. Now you've basically killed all social capabilities that were supposed to allow me sharing "on the go" without the annoying "sending e-mail" stuff, force me to use WP7 as I would use any other mobile phone, use a separate app to upload things to skydrive (in due quality), use e-mail to share my photos, etc. Hope these changes allow you to grow your user base, because at this point I believe many of your old installed base will keep on giving up on you. It simply seems that you never have a clear idea of what you want skydrive to be, keep throwing stuff as if we were all BETA testers and changing it afterwards. It makes all the good stuff you do feel bad, which is a pity.

  • I love skydrive and REALLY like the new permissions settings.  A couple of requests:

    1) When I have multiple files selected, I should be able to share them right? Please add multiselect to the thumbnails view.  This would be far superior rather than go through the sharing process for each individual file.   See, I want to upload ALL our photos to a directory but only want to share about half with my family.  Still very tedious to do this.

    2) PASS Along to WP7 team: When sharing a link within the pictures hub or skydrive app, it doesn't actually send the unique token link.  I shared a folder with another Windows Live user and it didn't show up on their shared items.  

    3) The ability to share with a group of individuals. If I add a contact to that group, it should enable that person to see the shared folders.

    4) A page that shows a simple list of files and who they are shared with.

  • nektar
    8 Posts

    Clearly, some members of your team are good at paying attention to and ironing out the details. However, some of them are not so capable perhaps.

    I wanted to set up a Hotmail account to also retrieve e-mail messages from a POP account. I found several mistakes in the user interface which show a lack of attention to detail.

    1. When you view the options page which includes the functionality to set up POP mail retrieval, there is another feature listed on that page, which enables you to setup another e-mail address which you can choose to send mail from. However, the interface does not say or imply that enabling the first service, namely retrieving mails from a POP account, would also add its corresponding e-mail address in the from field, i.e. it would enable the second feature too.

    2. Then, when you start the process of entering the POP account's settings, there is a Privacy link on that page which brings up the Microsoft Privacy statement. Why looking at a general privacy statement at that time is necessary, is neither explained nor implied. Perhaps it has something to do with remembering the POP account's password, but I don't want to have to scroll through the long privacy statement to find that out and even if I do, I can't be sure that this was the reason for linking to this general-purpose statement.

    3. The worse thing of it all is when you click the Privacy link, the statement does not appear in a new browser window but browses away from the options page. Clicking the Back button to return back to the options page gives you an IE page that says that the previous page has expired. What a horrible experience.

    4. Also, on the POP account's options page, there is an Advanced Settings link which enables you to enter the incoming mail server information and set up some "more advanced" things. However, these so called "Advanced" things seem to be always needed to be changed, as the default incoming mail server that Hotmail tries is it seems "pop.yourmailaddressdomain.com" something that is most times wrong. Some ISPs use "mail.yourmailaddress.com" as the incoming server. Will an ordinary user know that they would have to click on the "Advanced" link? Obviously not.

    5. Plus, when you press this link, the whole options page refreshes, a very bad experience. The link should have simply expanded the extra options in place, without refreshing the whole page.

    6. Finally, the worst part of it all: When you finish setting up the POP account's settings, there is no confirmation page telling you that mail retrieval is now working properly. The only thing it says is that you have set up the POP account now and you can check the status by going to another page. This is unacceptable. It should instead provide a comforting message or an error and not give you some vague message that the settings are now saved and ask you to click on another page if you want to check the status, which I am not sure if it even works.

    7. After the set up had been completed, when going through the Inbox, I found the following strange e-mail waiting: "Please verify your e-mail address"

    Yes, that's right. Hotmail had sent an automatic e-mail to the POP account which was retrieved by Hotmail again (Hotmail had retrieved the e-mail it has sent) and then it asked me to click on a link in that e-mail to verify the POP account's e-mail address. But why is this needed? Since Hotmail had retrieved successfully from the POP account the automatic e-mail it had just sent to through it, then why do I have to verify by clicking a link. The verification could have been totally automated if required and behind the scenes. Hotmail could check that it had received back the same e-mail it had sent, completely automatically.

    But in any case, why was a verification even necessary? The fact that one can retrieve e-mails from the POP account is the verification needed to check whether that person in fact owns the POP account's e-mail address. Why is an extra verification needed?

    The above is just one example of many, indicating the Hotmail team's sloppiness. I think that a training course for all the developers and testers on the team is necessary.

  • fxiao
    1 Posts

    The new changes on sharing photos are awful. You can't share with a group of people anymore (category created in contacts), you have to add them one by one and up to 20 people at a time. And the people my album is being shared with have to receive an email notification, that's pretty annoying. Furthermore, the new album shared with friends is no longer showing on their Social Update. PLEASE LISTEN TO US, and bring back the old functionalities!!!!!!!!!!!!!!!!!!!!!!!!! Otherwise, people are moving to other choices

  • Commenting on topic, SkyDrive is a stunning piece of engineering, and as a new user of Windows Live exploring how Windows 7 and Office 2010 work with Live Web Apps, it feels like Christmas every day. My query here is with respect to OneNote, where I essentially live. Please be explicit about how SkyDrive manages password-protected "sections" of OneNote notebooks. I would assume these sections are "allowed" to be shared up onto SkyDrive and they are not indexed at all -- but would like to see this declared (one way or the other).

    Slightly off topic, but not knowing where else to post this, I wish you (Microsoft) had a comprehensive portal (with deep internal document linkages) to help users like me find definitive, comprehensive, current, and accurate information about your (MS) software. It's actually been easier for me to learn about SkyDrive and Live Mesh (and many features of Windows 7 itself) from sites outside your corporate umbrella (such as www.7tutorials.com).  I typically discover "sections" by accident (often Bing search) that have no apparent internal linkage to each other -- like MS Answers, At Home, Connect, Learning, Newsgroups, TechCenter, IT Pro Technet Forum, Marketplace, Media, etc. I understand you may need a simple "At Home" guide to provide broad brush introduction to software, but with deeper linkages at the bottom of these pages, someone like me could easily branch off to find more technical details about requirements and installation and setup (as well as deeper descriptions about how it "works" when it's "working") before "falling" into the forums, which in general discuss problems and resolution.

    To be clear here, I love Windows 7 and OneNote and I am not complaining. Arthur C. Clarke famously wrote that any sufficiently advanced technology is indistinguishable from magic. This is magic, and if it doesn't come with a perfect guide one remains awed. No one ever turned down three wishes, or a wand, or a grail stone because they didn't know how these things really worked.

  • controlz
    145 Posts

    @abm - you can sync your IE favourites between computers you have running Windows Live Mesh 2011.

  • Interesting stuff to know :)

    Now, I would like to provide some feedback:

    1) Why does skydrive search for office files? Suppose, I have many many files (music, images etc) in my skydrive which are hard to find except with the help of " powerful search feature". But when I type the name of file it only searches for office files types. What about other file types? Why are they not included in search results?

    "We should be able to find any file type atleast by its name just like windows search"

    2) This is a general comment. It is good to see great improvements in Hotmail or Skydrive but Windows Live is not only about these two, there are many other products under its Umbrella so what about improving those products also like Groups, Calendar, Alerts etc. and provide better integration among all these. Of Course, this should not come at the expense of Hotmail or Skydrive, these should be top priority but attention Must also be given to other products especially to Groups, Messenger.

  • abm
    268 Posts

    Hi, few points I want to raise for Windows Live:

    1. Ability to synchronize the favorite or bookmarked sites and folders (categories) between SkyDrive and IE, so my favorites are always accessible wherever I go via IE built-in support and Bing bar (if using some other browser or platform). Also provide a simple web-based CRUD app (using MVC, with categories and search) which acts like a webapp for SkyDrive and webservice for IE and BingBar to manage, organize and synchronize these entries with IE (natively) or via BingBar.

    2. In live Hotmail, besides having files, docs, images and emoticon attachments options, please provide us with a feature to attach emails or the entire conversation from any email folder (Inbox, Sent, CustomFolders...) without leaving the page. Like we have this ability in Office Outlook: drag-drop the selected emails in the compose window and they will be attached to the message. Other than the Outlook client, this feature is not yet incorporated in any web based email solution till date. An implicit way is by opening the desired emails individually one-by-one and copy-paste the content, download all the attachments in system and attach them.

    3. While composing the message in Hotmail if you press Spell check, you will get this message at the top "The spelling checker found some words for you to look at. If you write more, check again". Notice that the phrase "check again" is a link, but there is no link to turn off the spell check once its on. Something like "finish checking" or "finish" link would be great to remove the remaining highlights.

    4. When we delete the emails using the sweep the feature, the "Also block the future messages" option doesn't work if any of the sender is in the person's contact list. But there is no notification from the system for this. To avoid it to happen, the system should display the appropriate, friendly success and failure messages at the end of the XMLHttpRequest. Lets say:

                - Sweep operation completed successfully!

                - Apologies, sweep operation was not successful. Please try again later!

                - Sweep operation completed successfully, and the future messages from the selected senders would be blocked!

                - Sweep operation completed successfully, but the following sender couldn't be blocked because they are in your contact list:

                               [checkbox] "Guy 1 Name" someGuy1@email.com

                               [checkbox] "Guy 2 Name" someGuy2@gmail.com

                     Select to remove the contacts and try again

    Thank you!