Site discussion
Site Updates 2014/5/18
- Changed the site logo. I'm not actually sure of the origins of this logo -- I know Our Pants uses it, but I'm not sure where it came from. Can anyone fill me in on that?
- Fixed a bug where non-English transcripts couldn't be marked complete.
- Admins can now approve/reject words submitted for the spellchecker. There's also an option for "lowercase": this will convert the word to lowercase before adding it to the dictionary. Use this for words that are submitted as Capitalized but are fine lowercase as well.
- Admins can now undelete articles. This is under the Admin:Articles section, and the articles are listed by when they were deleted.
- There was some downtime tonight as I did some server maintenance (upgrading the OS, etc.). This went well enough.
After all the promotion from FYNF/John/Hank, we've had a huge influx of visitors and a lot of progress has been made in transcribing, so thank you to all the new users who have started contributing, and also the old users who've helped get the site to where it is today.
I think vondell swain made it. Maybe it was vihart. I don't know really.
Vondell did indeed make the logo. Vihart made the one that's the skull and nerdfighter hands.
I wish there was an easy way to convert the SRT files that I get from subtitl.us into a transcript format that's easy to put on the wiki, because if there were, I would do it. But I can't find a good way to strip out all of the time codes.
Unfortunately, as Kyle said, I can't really do anything with the translations that are on the wiki; I can't make them into caption files because they lack the necessary timing info.
Subtitl.us does make it easier for people to time code than doing it from scratch though; it gives users a text file with the timecodes and English text, then users just have to translate line by line and leave the timecodes as is. If that makes sense.
If anyone can find a way to either extract caption tracks and put them on the wiki, or make it so it's easy to post srts on here, I would love to get the translations that are on subtitl.us on the wiki as well.
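For the "strip out all of the time codes" part, here's a rough Python sketch that might work as a starting point. It assumes the files are standard SRT (a cue number, a timing line like `00:00:01,000 --> 00:00:04,000`, then the caption text, with blank lines between cues). The function name `srt_to_text` is just made up for illustration, and I haven't checked it against actual subtitl.us exports.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,000".
TIMING = re.compile(
    r"^\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}"
)


def srt_to_text(srt: str) -> str:
    """Return only the caption text from SRT content, one cue per line."""
    captions = []
    # Normalize Windows line endings and drop a leading byte-order mark.
    normalized = srt.replace("\r\n", "\n").lstrip("\ufeff").strip()
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = [ln.strip() for ln in block.split("\n") if ln.strip()]
        # Everything after the timing line is caption text; the cue
        # number sits before it and is dropped along with the timing.
        for i, ln in enumerate(lines):
            if TIMING.match(ln):
                lines = lines[i + 1:]
                break
        if lines:
            captions.append(" ".join(lines))
    return "\n".join(captions)
```

The output is plain text with one caption per line, so it would still need to be grouped into paragraphs by hand before it reads like a wiki transcript.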
Hmm, clearly I don't understand the way threads work on here, oops!
This is a question I saw on a language discussion page, that I found I couldn't answer: What exactly is our relationship with subtitl.us? The FYNF announcement read:
"Similarly, if you are multilingual, we could use your talents in helping to translate videos for our educational channels. See below for a full list of courses and channels that are available for translation... Instructions regarding the translation process are available on each individual site."
I haven't been able to find these directions... should translators take the English scripts from subtitl.us or from the wiki?
It's all very confusing. Basically, to be used as closed captions, non-English transcripts need to be timed. The wiki doesn't support timed transcripts very well, while subtitl.us only supports timed transcripts. But timed transcripts are a much bigger task, so I haven't wanted to tell translators that they must do timed-only.
So our relationship is kind of... slightly symbiotic coexistence? Personally I like having the translations available on the wiki, but in the end (at least for now) subtitl.us is more useful for non-English transcripts.
If you're translating a video, it's probably best to use the subtitl.us script. That saves a lot of time since you don't have to worry about the timing info as much.
I was thinking it would be good if the spellcheck page, where it lists word, user, accept, reject and lowercase, could also have a column saying which channel the word came from, or a link to the transcript, or a place to preview the section of transcript that it came from. I'm not sure which one would be easier to code.
My thinking is that sometimes words which don't seem to make sense could be easier to review if we knew that they came from an educational channel that frequently uses scientific, medical or international words. Or, if a preview would be easier to code, being able to see the word in context might also help.
I'll see what I can do for that. I'm not sure if I'll be able to include the snippet, but I can at least get some other info in there.
Should notes like this: ["Crazy" by Gnarls Barkley is heard playing in the background] be included in transcripts or not?
Yeah, that's fine. You don't need to include it, but if you feel it's important enough to note, go ahead.
Just found this great tumblr devoted to starting a volunteer and viewer-powered movement to transcribe videos from a bunch of different YouTubers.
Might be great to get in touch and share ideas/resources/etc.
http://youtube-captions.tumblr.com/
https://www.youtube.com/user/Anmlwndrs! Whoo!
Site Updates 2014/5/2
The new version of the wiki is now live. The domain changes are still propagating, so it could take up to 24 hours to see the new site, but the change just took effect for me.
Major Changes
- Spell checking. All transcripts are automatically spell checked when they're saved. The spell checker uses US English with a custom wordlist of nerdfighteria terms, so words like "nerdfighter" aren't flagged as misspelled. You can also check the spelling by clicking a button on the toolbar. Transcript misspellings can be fixed in bulk by clicking Proofread at the top. I still need to run all existing transcripts through the spell checker, so that's a work in progress.
- New text editor. I've converted the site from CKEditor to TinyMCE. This is the editor used on discussion pages, transcript editing, and wiki editing. This'll make future improvements easier. From my testing, all previous functionality (keyboard shortcuts, etc.) should still work correctly.
- New host. The site is no longer on a shared host. I've moved it to its own server, which gives a lot more flexibility for future development.
- Calendar. You can now view current videos and scheduled videos in a calendar format.
- Transcript priority. All transcripts have a priority (a number from 0 to 100) to indicate how much we need it transcribed or translated. This priority is set by admins. Click Transcribe at the top to view the top priorities. Click Translate to view the priorities for different languages.
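To give a rough idea of how a spell check with a custom wordlist behaves, here's a minimal Python sketch. It is not the wiki's actual code: the function name `find_misspellings`, the tiny word sets, and the case-insensitive matching are all assumptions made for illustration.

```python
import re

# A word is a run of letters, optionally with internal apostrophes ("it's").
WORD_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*")


def find_misspellings(text, base_words, custom_words):
    """Return words found in neither list, in order, without repeats."""
    # Assumption: matching ignores case, so "Good" is covered by "good".
    known = {w.lower() for w in base_words} | {w.lower() for w in custom_words}
    flagged = []
    for word in WORD_RE.findall(text):
        if word.lower() not in known and word not in flagged:
            flagged.append(word)
    return flagged
```

With a base list of `{"good", "morning", "it's", "tuesday"}` and a custom list of `{"nerdfighter"}`, the text "Good morning nerdfighter, it's Tuseday" flags only "Tuseday". Without the custom list, "nerdfighter" would be flagged too.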
Minor Changes
- Implemented DataTables on the Video Statistics page for channels. This gives all the same stats in a paged, sortable display.
- Implemented new dropdowns on the wiki article editor. This has an autocomplete/search thing to make selections easier.
- On the schedules page, schedules are now sorted by the video priority of that channel. That is, higher priority channels are listed first.
- You can now expand nodes on the wiki article nav without going to that article.
- The wiki article tree now groups all articles with child articles at the top.
- Video statistics are resynced from YouTube once a week. It does this in chunks of 20 or so videos every few minutes, and scales with the number of videos to ensure every video is updated once a week.
- Users are now asked if they'd like to spellcheck a transcript before marking it as complete.
- Users who sign up for a schedule are now notified via email when a video is uploaded to that schedule. Currently does not work for category-based schedules.
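For the weekly stats resync mentioned above, the chunk size arithmetic could look something like this Python sketch. The 5-minute interval, the function names, and the "pick the stalest videos first" rule are assumptions for illustration; the real job may work differently.

```python
import math

MINUTES_PER_WEEK = 7 * 24 * 60  # 10080


def chunk_size(total_videos: int, interval_minutes: int = 5) -> int:
    """Videos to resync per run so that all are covered within one week."""
    runs_per_week = MINUTES_PER_WEEK // interval_minutes
    # Round up so the whole catalogue fits into a week; at least 1 per run.
    return max(1, math.ceil(total_videos / runs_per_week))


def next_chunk(last_synced: dict[str, float], size: int) -> list[str]:
    """Pick the `size` video IDs whose stats were synced longest ago."""
    return sorted(last_synced, key=last_synced.get)[:size]
```

Under these assumptions, 40,000 videos at one run every 5 minutes works out to 20 videos per run, and the chunk grows automatically as more videos are added.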
Cosmetic Changes
- Lots of miscellaneous style changes throughout the site.
- Renamed "wiki entries" to simply "articles."
- Removed the home link from the menu. The banner text now links to the homepage.
- Renamed "Channels" to "Videos" in the menu.
- Reorganized the channels to add a "cultural" group.
- Reversed the direction of the channel groups, so that Main/Educational are closer to the mouse when opening the menu.
- Renamed the user nav (the tabs at the top) to be more intuitive, based on what you'd like to do.
- Slightly updated the Getting Started guide and the homepage message.
- The article history tab now displays as a table.
- The preferences/messages/logout links have been moved to a dropdown menu.
Admin-only changes
- Admins can now set the priority of channels and videos.
- Admins can lock or unlock a transcript from editing.
- Admins can flag a video as having captions enabled or not.
- Admins can edit the custom spellcheck dictionary.
- Admins can add transcripts to the newly-created Transcript Queue. This allows admins to preload video transcripts if they have them in advance. When the next video with the given video ID or on the given channel is uploaded, that transcript is automatically added to the video.
- Admins can view the status of the various automated jobs. Currently only one job is listed, though.
- Admins can force a resync of the videos for a channel.
- Admins can force a resync of the info for a channel.
- Admins can subscribe to a channel to receive notifications when a transcript is completed.
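As an illustration of the Transcript Queue matching described above (by video ID, or by the next upload on a channel), here's a hedged Python sketch. The `QueuedTranscript` class, the `claim_transcript` function, and the rule that an exact video ID match wins over a channel match are all hypothetical; they are not taken from the site's code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueuedTranscript:
    text: str
    video_id: Optional[str] = None  # match one specific video, or...
    channel: Optional[str] = None   # ...the next upload on this channel


def claim_transcript(queue, video_id, channel):
    """Remove and return the first queued transcript matching a new upload."""
    # Assumption: an exact video ID match is preferred over a channel match.
    rules = (
        lambda q: q.video_id == video_id,
        lambda q: q.video_id is None and q.channel == channel,
    )
    for matches in rules:
        for i, entry in enumerate(queue):
            if matches(entry):
                # Popping the entry means each transcript is used only once.
                return queue.pop(i).text
    return None  # nothing preloaded for this upload
```

Each queued transcript is consumed when it is claimed, so a channel-level entry attaches to only the next upload on that channel.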
That should be everything. A lot of things are still works-in-progress, and in the transition I'm sure various things broke. So I'll be spending a lot of time in the near future developing the new features some more and fixing anything that broke along the way.
Let me know here, via PM, or via email if you notice any issues, have further suggestions, want to complain about the changes, etc.
Complain!? This all looks gorgeous!
Though would it be possible to further explain the Transcript Queue and Automated Jobs? Thanks.
Thanks!
- Automated jobs. Right now there are four or five scripts that run on a schedule (grabbing new videos from youtube, updating channel information, resyncing vid stats, etc.). The admin section is an interface for monitoring those. It likely won't be much use to anyone but me, though.
- Transcript queue. This is a feature I added for Valerie, who will be working more with the wiki so she can get captions up on the videos. She'll be able to preload the scripts for certain videos (particularly Crash Course) so transcribing will be much easier. The queue is to make preloading these scripts easier.
My DNS annoyingly decided to revert to the old settings, so I've been waiting for the past couple hours for it to go back to the new site. Gah!
Also, I ran the spellchecker on all existing transcripts. There are a whole bunch of words that should get added to the dictionary, but other than that, it seems to have gone well enough.
Speaking of which, I've noticed the Crash Course (and maybe other channels) scripts that Valerie puts aren't marked as complete - is this because we should paragraph them first?
Also, how do we deal with on-screen text? The fun facts in Crash Course, for example - there have been a few times where I wanted to search for a certain text I know was on-screen (but maybe it's just me)...
Valerie is supplying the scripts, but once they're here, we need to make sure the script matches the video (there's always the chance they deviated from the script at some point). It should also be reformatted as needed, and maybe spellchecked. At that point I think they can be marked complete.
She'll be taking the completed transcripts and uploading them as captions on the videos.
For onscreen text, that's a good question. I'd say don't include it for now, but it'd be good to look into trying to add them in.
Is locking the videos a priority or should they just be left as complete unless someone starts spamming them? If they are locked is this a signal that they are ready to be used as captions or does marking a transcript as complete do the same job as locking?
Right now, just marking it as complete indicates that it can be used as captions, so locking them isn't necessary for that.
I don't think locking needs to be a priority. If you're going through old transcripts and fixing spellings / formatting the text / etc., you can go ahead and lock it afterwards. That makes it easier to keep track of which videos haven't been finished up / reviewed. So I guess, feel free to do that if you'd like, as it does serve a purpose beyond preventing spam. But there's not an immediate rush to get that done.
I have stats for the wiki itself, but that's unfortunately not very representative of all vlogbrothers viewers. This video might help, though: http://www.youtube.com/watch?v=miRaX5gtb4o
There's been a lot of development on the new site. I have a few more pieces to code, and then I think it should be ready to go. The current plan is that the current site will go offline this Friday or Saturday.
There will be a few hours of downtime as I transfer the content, point the domain to the new server, and do some general testing. I'll put a notice on the homepage the day of.